Abstract

Let $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ be two sequences of independent and identically distributed (iid) random variables taking their values, uniformly, in a common totally ordered finite alphabet. Let $LCI_n$ be the length of the longest common and (weakly) increasing subsequence of $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$. As $n$ grows without bound, and when properly centered and scaled, $LCI_n$ is shown to converge, in distribution, towards a Brownian functional that we identify.
1 Introduction

We analyze below the asymptotic behavior of the length of the longest common subsequence in random words with an additional (weakly) increasing requirement. Although it has been studied from an algorithmic point of view in computer science, bio-informatics, or statistical physics (see, for instance, [CZFYZ], [DKFPWS] or [Sak]), to name but a few fields, mathematical results for this hybrid problem are very sparse. To present our framework, let $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ be two infinite sequences whose coordinates take their values in $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$, a finite totally ordered alphabet of cardinality $m$. Next, $LCI_n$, the length of the longest common and (weakly) increasing subsequences of the words $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$, is the maximal integer $k \in \{1, \dots, n\}$ such that there exist $1 \le i_1 < \cdots < i_k \le n$ and $1 \le j_1 < \cdots < j_k \le n$ satisfying the following two conditions:

(i) $X_{i_s} = Y_{j_s}$, for all $s = 1, 2, \dots, k$,

(ii) $X_{i_1} \le X_{i_2} \le \cdots \le X_{i_k}$ and $Y_{j_1} \le Y_{j_2} \le \cdots \le Y_{j_k}$.

(Asymptotically, the strictly increasing case is of little interest, having $m$ as a pointwise limiting behavior.) $LCI_n$ is a measure of the similarity/dissimilarity of the random words often used in pattern matching, and its asymptotic behavior is the purpose of our study. This limiting behavior differs from that of another, better-known measure of similarity/dissimilarity, namely $LC_n$, the length of the longest common subsequences of two or more random words. Indeed, after renormalization, the first result on $LC_n$, obtained in [HI], reveals, under a sublinear variance lower bound assumption, a normal limiting law. In contrast, for $LCI_n$, we have:


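Theorem 1.1 Let $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ be two sequences of iid random variables uniformly distributed on $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$, a totally ordered finite alphabet of cardinality $m$. Let $LCI_n$ be the length of the longest common and increasing subsequences of $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$. Then, as $n \to +\infty$,

$$\frac{LCI_n - n/m}{\sqrt{n/m}} \Longrightarrow \max_{0 = t_0 \le t_1 \le \cdots \le t_{m-1} \le t_m = 1} \min\left( -\frac{1}{m}\sum_{i=1}^{m} B_1^{(i)}(1) + \sum_{i=1}^{m} \left( B_1^{(i)}(t_i) - B_1^{(i)}(t_{i-1}) \right),\ -\frac{1}{m}\sum_{i=1}^{m} B_2^{(i)}(1) + \sum_{i=1}^{m} \left( B_2^{(i)}(t_i) - B_2^{(i)}(t_{i-1}) \right) \right), \qquad (1.1)$$

where $B_1$ and $B_2$ are two $m$-dimensional standard Brownian motions on $[0,1]$.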


The main motivation for our work has its origins in the identification, first obtained by Kerov [Ker], of the limiting length (properly centered and scaled) of the longest increasing subsequence of a random word, as the maximal eigenvalue of a certain Gaussian random matrix. When combined with results of Baryshnikov [Bar] or Gravner, Tracy and Widom [GTW] (see also [BGH]), this limiting law has a representation as a Brownian functional. Moreover, the longest increasing subsequence corresponds to the first row of the RSK Young diagrams associated with the random word and [Ker, Chap. 3, Sec. 3.4, Theorem 2] showed that the whole normalized limiting shape of these RSK Young diagrams is the spectrum of the traceless Gaussian Unitary Ensemble (GUE). Since the length of the top row of the diagrams is the length of the longest increasing subsequence of the random word, the maximal eigenvalue result is recovered. (The asymptotic length result was rediscovered by Tracy and Widom [TW] and the asymptotic shape one by Johansson [Joh]. Extensions to non-uniform letters were also obtained by Its, Tracy and Widom [ITW1, ITW2].) Another motivation for the present study comes from the interpretation of the $LCI_n$ functional in terms of last passage time in directed percolation. This is detailed in our concluding remarks.

The asymptotic behavior of the length of the longest common and increasing subsequences has actually already been investigated for binary words ($m = 2$) in [HLM]. However, the methods used there do not extend to an alphabet of arbitrary finite size $m$. When $m = 2$ with letters $\alpha_1$ and $\alpha_2$, it is enough to consider common subsequences made of a random number of common $\alpha_1$'s deterministically completed by the common $\alpha_2$'s, so that in a way the corresponding study reduces to dealing with only one type of letter. In contrast, when $m \ge 3$, the situation is much more complicated since a similar strategy reduces the problem to $m - 1$ types of letters, for which there is still, roughly speaking, too much randomness to handle the study of $LCI_n$ in this way. A new methodology based on a new representation of $LCI_n$ is thus required to deal with a general finite alphabet of size $m$. This is achieved below, where an appropriate representation of $LCI_n$, which allows one to investigate its asymptotic behavior for arbitrary $m \ge 2$, is obtained. Our results thus extend and encompass the binary $LCI_n$ result of [HLM]. The dependence (or independence) structure between the two sequences of letters $X$ and $Y$ is carried over at the limit into a similar structure between the two standard Brownian motions $B_1$ and $B_2$. Hence, when $X = Y$, our results recover, with the help of [BGH], the weak limits obtained in [Ker], [Joh], [TW], [ITW1], [ITW2], [HL], and [HX], while if $X$ and $Y$ are independent so are $B_1$ and $B_2$. As a by-product of our approach, we further fix some loose points present in [HLM]. As suggested to us, let us further put our main theorem in context. At first, for $m = 2$, the right-hand side of (1.1) becomes


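$$\max_{0 \le t \le 1} \min\left( \frac{B_1^{(2)}(1) - B_1^{(1)}(1)}{2} + \left( B_1^{(1)}(t) - B_1^{(2)}(t) \right),\ \frac{B_2^{(2)}(1) - B_2^{(1)}(1)}{2} + \left( B_2^{(1)}(t) - B_2^{(2)}(t) \right) \right).$$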


In case the two two-dimensional standard Brownian motions are independent, this last expression has the same law as

$$\sqrt{2}\, \max_{0 \le t \le 1} \min\left( B_1(t) - \frac{1}{2} B_1(1),\ B_2(t) - \frac{1}{2} B_2(1) \right),$$

where, now, $B_1$ and $B_2$ are two independent one-dimensional standard Brownian motions on $[0,1]$. Therefore, our limiting result matches the binary one presented in [HLM].

Next, and still for further context, let us compare the asymptotic behavior of $LCI_n$ to that of, say, $L_n$, the length of the optimal alignments which align only one type of letter. (In case of a single word, $L_n$ could correspond to, e.g., the length of the longest constant subsequences.) Clearly, $LCI_n \ge L_n$ and, under a uniform assumption,

$$\lim_{n \to +\infty} \frac{LCI_n}{n} = \lim_{n \to +\infty} \frac{L_n}{n} = \frac{1}{m}, \qquad (1.2)$$

with probability one. Moreover, it is easy to see that, as $n \to +\infty$,

$$\frac{L_n - n/m}{\sqrt{n/m}} \Longrightarrow \min\left( \sqrt{1 - \frac{1}{m}}\, B_1(1),\ \sqrt{1 - \frac{1}{m}}\, B_2(1) \right), \qquad (1.3)$$


for, say, two one-dimensional standard Brownian motions $B_1$ and $B_2$. Now returning to (1.1), note that for $j = 1, 2$,

$$-\frac{1}{m}\sum_{i=1}^{m} B_j^{(i)}(1) + \sum_{i=1}^{m} \left( B_j^{(i)}(t_i) - B_j^{(i)}(t_{i-1}) \right) = \frac{1}{m}\left( (m-1) B_j^{(m)}(1) - \sum_{i=1}^{m-1} B_j^{(i)}(1) \right) + \sum_{i=1}^{m-1} \left( B_j^{(i)}(t_i) - B_j^{(i)}(t_{i-1}) \right) - B_j^{(m)}(t_{m-1}), \qquad (1.4)$$

where the random variable $\left( (m-1) B_j^{(m)}(1) - \sum_{i=1}^{m-1} B_j^{(i)}(1) \right)/m$ has exactly the same law as $\sqrt{1 - 1/m}\, B_j(1)$. Therefore, the presence of the extra terms involving the $t_i$'s on the right-hand side of (1.1) allows one to distinguish the renormalized limit of $LCI_n$ from that of $L_n$ and ensures that the latter limit is still almost surely dominated by the former. This observation should be contrasted with the non-uniform case where a single letter is attained with maximal probability $p_{\max}$, and where $L_n$ aligns this letter. Indeed, in view of (5.1), below, when centered by $n p_{\max}$ and scaled by $\sqrt{n p_{\max}}$, both $LCI_n$ and $L_n$ converge to

$$\min\left( \sqrt{1 - p_{\max}}\, B_1(1),\ \sqrt{1 - p_{\max}}\, B_2(1) \right).$$

A natural question arising from this study is the random matrix interpretation of our limiting distribution (1.1). Another natural question is to interpret $LCI_n$ in terms of RSK Young diagrams and to investigate, more generally, the shape of an RSK counterpart of $LCI_n$. Both questions actually go far beyond the scope of this paper but will be the subject of forthcoming investigations.

As for the content of the paper, the next section (Section 2) establishes a pathwise representation for the length of the longest common and increasing subsequence of the two words as a max/min functional. In Section 3, the probabilistic framework is initiated: the representation becomes the maximum over a random set of the minimum of random sums of randomly stopped random variables. The various random variables involved are studied and their (conditional) laws found. In Section 4, the limiting law is obtained. This is done in part by a derandomization procedure (of the random sums and of the random constraints) leading to the Brownian functional (1.1) of Theorem 1.1. In the last section (Section 5), various extensions and generalizations are discussed as well as some open questions related to this problem. Finally, Appendix A.1 completes the proof of some technical results, while Appendix A.2 gives missing steps in the proof of the main theorem in [HLM] as well as corrections to arguments presented there, providing, in the much simpler binary case, a rather self-contained proof.
2 Combinatorics


The aim of this section is to obtain a pathwise representation for the length of the longest common and increasing subsequences of two finite words. Throughout the paper, $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ are two infinite sequences whose coordinates take their values in $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$, a finite totally ordered alphabet of cardinality $m$. Recall next that $LCI_n$ is the maximal integer $k \in \{1, \dots, n\}$ such that there exist $1 \le i_1 < \cdots < i_k \le n$ and $1 \le j_1 < \cdots < j_k \le n$ satisfying the following two conditions:

(i) $X_{i_s} = Y_{j_s}$, for all $s = 1, 2, \dots, k$,

(ii) $X_{i_1} \le X_{i_2} \le \cdots \le X_{i_k}$ and $Y_{j_1} \le Y_{j_2} \le \cdots \le Y_{j_k}$.
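
For small words, the definition above can be evaluated directly. The following is a minimal sketch, not taken from the paper: it assumes the letters are encoded as the integers 1, ..., m (a hypothetical encoding), and the function name `lci_length` is ours. It uses a standard dynamic program over the largest letter allowed so far.

```python
def lci_length(x, y, m):
    """Length of the longest common weakly increasing subsequence of x and y.

    Assumption of this sketch: letters are the integers 1..m.
    """
    nx, ny = len(x), len(y)
    # prev[i][j]: best length using x[:i], y[:j] and letters <= a - 1.
    prev = [[0] * (ny + 1) for _ in range(nx + 1)]
    for a in range(1, m + 1):
        # cur[i][j]: best length using x[:i], y[:j] and letters <= a.
        cur = [[0] * (ny + 1) for _ in range(nx + 1)]
        for i in range(nx + 1):
            for j in range(ny + 1):
                best = prev[i][j]  # the letter a is not used at all
                if i > 0:
                    best = max(best, cur[i - 1][j])  # drop x[i-1]
                if j > 0:
                    best = max(best, cur[i][j - 1])  # drop y[j-1]
                if i > 0 and j > 0 and x[i - 1] == a and y[j - 1] == a:
                    # append a matched letter a, the largest letter allowed
                    best = max(best, cur[i - 1][j - 1] + 1)
                cur[i][j] = best
        prev = cur
    return prev[nx][ny]
```

For instance, `lci_length([1, 3, 2, 3], [2, 1, 3, 3], 3)` evaluates the words $\alpha_1\alpha_3\alpha_2\alpha_3$ and $\alpha_2\alpha_1\alpha_3\alpha_3$, whose longest common weakly increasing subsequence is $\alpha_1\alpha_3\alpha_3$.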

Now that $LCI_n$ has been formally defined, let us set some standing notation. Let $N_r(X)$, $r = 1, \dots, m$, be the number of $\alpha_r$'s in $X_1, X_2, \dots, X_n$, i.e.,

$$N_r(X) = \#\{ i = 1, \dots, n : X_i = \alpha_r \} = \sum_{i=1}^{n} \mathbf{1}_{\{X_i = \alpha_r\}}, \qquad (2.1)$$

and similarly let $N_r(Y)$, $r = 1, \dots, m$, be the number of $\alpha_r$'s in $Y_1, Y_2, \dots, Y_n$. Clearly,

$$\sum_{r=1}^{m} N_r(X) = \sum_{r=1}^{m} N_r(Y) = n.$$


Let us further set a convention: throughout the paper, when there is no ambiguity or when a property is valid for both sequences $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$, we often omit the symbol $X$ or $Y$ and, e.g., write $N_r$ for either $N_r(X)$ or $N_r(Y)$ or, below, $H$ for either $H_X$ or $H_Y$.

Continuing on our notational path, for each $r = 1, \dots, m$, let $N_r^{s,t}(X)$ be the number of $\alpha_r$'s in $X_{s+1}, X_{s+2}, \dots, X_t$, i.e.,

$$N_r^{s,t}(X) = \#\{ i = s+1, \dots, t : X_i = \alpha_r \} = \sum_{i=s+1}^{t} \mathbf{1}_{\{X_i = \alpha_r\}}, \qquad (2.2)$$

with a similar definition for $N_r^{s,t}(Y)$. Again, it is trivially verified that

$$\sum_{r=1}^{m} N_r^{s,t}(X) = \sum_{r=1}^{m} N_r^{s,t}(Y) = t - s,$$

and, of course, $N_r^{0,n} = N_r$. Still continuing with our notations, let $T_r^j(X)$, $r = 1, \dots, m$, be the location of the $j$-th $\alpha_r$ in the infinite sequence $X_1, X_2, \dots, X_n, \dots$, with the convention that $T_r^0(X) = 0$. Then, for $j = 1, 2, \dots$, $T_r^j(X)$ can be defined recursively via

$$T_r^j(X) = \min\{ s \in \mathbb{N} : s > T_r^{j-1}(X),\ X_s = \alpha_r \}, \qquad (2.3)$$

where as usual $\mathbb{N} = \{0, 1, 2, \dots\}$. Again, replacing $X$ by $Y$ gives the corresponding notion for the sequence $Y = (Y_j)_{j \ge 1}$.
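
The counting variables (2.1), (2.2) and the locations (2.3) are straightforward to compute on a finite word. The sketch below is only illustrative: the helper names `count_letter` and `location` are ours, letters are assumed to be encoded as integers, and, since only a finite prefix of the infinite sequence is available, `location` returns `None` when the requested letter does not occur often enough.

```python
def count_letter(word, r, s=0, t=None):
    """N_r^{s,t}: number of letters r among positions s+1, ..., t (1-indexed).

    With the defaults s = 0 and t = len(word), this is N_r.
    """
    if t is None:
        t = len(word)
    return sum(1 for c in word[s:t] if c == r)


def location(word, r, j):
    """T_r^j: 1-indexed position of the j-th letter r, with T_r^0 = 0.

    Returns None if the finite word holds fewer than j letters r.
    """
    if j == 0:
        return 0
    seen = 0
    for pos, c in enumerate(word, start=1):
        if c == r:
            seen += 1
            if seen == j:
                return pos
    return None
```

On any window the counts over all letters add up to the window length, which mirrors the identity displayed after (2.2).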

Next, let us begin our search for a representation of $LCI_n$ via the random variables defined to date. First, let $H_X(k_1, k_2, \dots, k_{m-1})$ be the maximal number of $\alpha_m$'s contained in an increasing subsequence of $X_1 X_2 \cdots X_n$ containing $k_1$ $\alpha_1$'s, $k_2$ $\alpha_2$'s, $\dots$, $k_{m-1}$ $\alpha_{m-1}$'s picked in that order. Replacing $X = (X_i)_{i \ge 1}$ by $Y = (Y_i)_{i \ge 1}$, it is then clear that

$$\min\left( k_1 + \cdots + k_{m-1} + H_X(k_1, \dots, k_{m-1}),\ k_1 + \cdots + k_{m-1} + H_Y(k_1, \dots, k_{m-1}) \right) \qquad (2.4)$$

is the length of the longest common and increasing subsequence of $X_1 X_2 \cdots X_n$ and $Y_1 Y_2 \cdots Y_n$ containing exactly $k_r$ $\alpha_r$'s, for all $r = 1, 2, \dots, m-1$, the letters being picked in an increasing order. Hence, to find $LCI_n$, the function $H$ needs to be identified and (2.4) needs to be maximized over all possible choices of $k_1, k_2, \dots, k_{m-1}$.
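
The maximization of (2.4) can be sketched by brute force on small words. The code below is an illustration under stated assumptions, not the representation derived in this section: letters are encoded as integers 1, ..., m, the names `tail_count` and `lci_via_maxmin` are ours, and $H$ is evaluated greedily by collecting the earliest $k_1$ letters 1, then the earliest $k_2$ letters 2 after them, and so on, before counting the remaining letters m. Choices of $(k_1, \dots, k_{m-1})$ that cannot be realized in one of the two words are skipped.

```python
from itertools import product


def tail_count(word, ks, m):
    """Greedy evaluation of H(k_1, ..., k_{m-1}) on a finite word.

    Returns the number of letters m left after collecting, in that order,
    k_1 letters 1, ..., k_{m-1} letters m-1, or None if this is impossible.
    """
    pos = 0
    for r, k in enumerate(ks, start=1):
        taken = 0
        while taken < k:
            if pos >= len(word):
                return None  # not enough letters r after the previous step
            if word[pos] == r:
                taken += 1
            pos += 1
    return sum(1 for c in word[pos:] if c == m)


def lci_via_maxmin(x, y, m):
    """Maximize (2.4) over all (k_1, ..., k_{m-1}) feasible for both words."""
    bound = min(len(x), len(y))
    best = 0
    for ks in product(range(bound + 1), repeat=m - 1):
        hx, hy = tail_count(x, ks, m), tail_count(y, ks, m)
        if hx is None or hy is None:
            continue
        best = max(best, sum(ks) + min(hx, hy))
    return best
```

The enumeration has $(n+1)^{m-1}$ terms, so this is only meant for checking small examples by hand.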

Let us start with the maximizing constraints. Assume, for a while, that a single word, say, $X_1 \cdots X_n$, is considered. First, and clearly, $0 \le k_1 \le N_1$. Next, $k_2$ is the number of $\alpha_2$'s present in the sequence after the $k_1$-th $\alpha_1$. Any letter $\alpha_2$ is admissible but the ones occurring before the $k_1$-th $\alpha_1$, attained at the location $T_1^{k_1} \wedge n$. Since only $n$ letters are considered so far, there are thus $N_2^{0, T_1^{k_1} \wedge n}$ inadmissible $\alpha_2$'s and the requirement on $k_2$ writes $0 \le k_2 \le N_2 - N_2^{0, T_1^{k_1} \wedge n}$. Similarly, for each $r = 3, \dots, m-1$, $k_r$ is at most the number of letters $\alpha_r$ minus the inadmissible $\alpha_r$'s which occur during the recuperation of the $k_1$ $\alpha_1$'s, followed by the $k_2$ $\alpha_2$'s, followed by the $k_3$ $\alpha_3$'s, etc., in that order. Thus the requirement on $k_r$ is of the form $0 \le k_r \le N_r - \widetilde{N}_r^*$, where $\widetilde{N}_r^*$ is the number of $\alpha_r$'s occurring before the $k_i$ $\alpha_i$'s, $i \le r-1$, picked in the order just described. For $r = 1, 2$, and as already shown, $\widetilde{N}_1^* = 0$ and $\widetilde{N}_2^* = N_2^{0, T_1^{k_1} \wedge n}$. Assume next that, for $r \ge 3$, $\widetilde{N}_{r-1}^*$ is well defined; then $\widetilde{N}_r^*$ is the number of $\alpha_r$'s occurring before, in that order, the $k_1$ $\alpha_1$'s, $\dots$, the $k_{r-1}$ $\alpha_{r-1}$'s. A little moment of reflection makes it clear that the location of the $k_{r-1}$-th such $\alpha_{r-1}$ is $T_{r-1}^{k_{r-1} + \widetilde{N}_{r-1}^*}$, from which it recursively follows that:

$$\widetilde{N}_r^* = N_r^{0,\ T_{r-1}^{k_{r-1} + \widetilde{N}_{r-1}^*} \wedge n}.$$
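
The recursion just displayed can be transcribed literally for a finite word. This is a hedged sketch only: letters are assumed to be the integers 1, ..., m, the name `star_counts` is ours, and a location that is not reached within the word is truncated at $n$, as in the displayed formula.

```python
def star_counts(word, ks, m):
    """Counts of inadmissible letters for r = 1, ..., m-1 on a finite word.

    ks = (k_1, ..., k_{m-1}); the returned list starts with the value 0 for r = 1.
    """
    n = len(word)
    stars = [0]
    for r in range(2, m):  # r = 2, ..., m-1
        target = ks[r - 2] + stars[-1]  # rank of the last collected letter r-1
        if target == 0:
            loc = 0  # convention T^0 = 0
        else:
            loc, seen = n, 0  # truncation at n if the letter is never reached
            for pos, c in enumerate(word, start=1):
                if c == r - 1:
                    seen += 1
                    if seen == target:
                        loc = pos
                        break
        # number of letters r among the first loc positions
        stars.append(sum(1 for c in word[:loc] if c == r))
    return stars
```

For the word 2, 1, 3, 2, 1, 2, 3, 3, 4 with $m = 4$ and $(k_1, k_2, k_3) = (2, 1, 1)$, the second letter 1 sits at position 5, before which two letters 2 occur; the third letter 2 then sits at position 6, before which one letter 3 occurs.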


Remark 2.1 Note that $\widetilde{N}_r^*$, as well as $N_r^*$ defined below in (2.8), actually depends on $k_1, \dots, k_{r-1}$, but in order not to overload our notation we will omit this dependency thereafter.


Returning to two sequences $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$, the condition on $k_r$, $1 \le r \le m-1$, writes as

$$0 \le k_r \le \left( N_r(X) - \widetilde{N}_r^*(X) \right) \wedge \left( N_r(Y) - \widetilde{N}_r^*(Y) \right).$$

From these choices of indices and (2.4),

$$LCI_n = \max \min\left( \sum_{i=1}^{m-1} k_i + H_X(k_1, \dots, k_{m-1}),\ \sum_{i=1}^{m-1} k_i + H_Y(k_1, \dots, k_{m-1}) \right), \qquad (2.5)$$
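where the outer maximum is taken over $(k_1, \dots, k_{m-1})$ in

$$C_n = \left\{ (k_1, \dots, k_{m-1}) : k_1 \in C_{n,1},\ k_2 \in C_{n,2}(k_1),\ k_3 \in C_{n,3}(k_1, k_2),\ \dots,\ k_{m-1} \in C_{n,m-1}(k_1, \dots, k_{m-2}) \right\}, \qquad (2.6)$$

where $C_{n,1} = \left\{ 0 \le k_1 \le \left( N_1(X) - \widetilde{N}_1^*(X) \right) \wedge \left( N_1(Y) - \widetilde{N}_1^*(Y) \right) \right\}$ and for $i = 2, \dots, m-1$,

$$C_{n,i}(k_1, \dots, k_{i-1}) = \left\{ 0 \le k_i \le \left( N_i(X) - \widetilde{N}_i^*(X) \right) \wedge \left( N_i(Y) - \widetilde{N}_i^*(Y) \right) \right\}. \qquad (2.7)$$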


Next, observe that if $T_{r-1}^{k_{r-1} + \widetilde{N}_{r-1}^*} > n$, then $N_r - \widetilde{N}_r^* = 0$. Also, since the above maximum does not change under vacuous constraints, one can replace, in the defining constraints, $\widetilde{N}_r^*$ by $N_r^*$ recursively given via: $N_1^* = 0$ and for $r = 2, \dots, m-1$,

$$N_r^* = N_r^{0,\ T_{r-1}^{k_{r-1} + N_{r-1}^*}}. \qquad (2.8)$$


The combinatorial expression (2.5) then becomes

$$LCI_n = \max \min\left( \sum_{i=1}^{m-1} k_i + H_X(k_1, \dots, k_{m-1}),\ \sum_{i=1}^{m-1} k_i + H_Y(k_1, \dots, k_{m-1}) \right),$$

where the outer maximum is taken over $(k_1, \dots, k_{m-1})$ in $C_n$, with $C_n$ and $C_{n,i}$, $i = 1, \dots, m-1$, respectively defined as in (2.6) and in (2.7) but with $\widetilde{N}_i^*$ replaced by $N_i^*$, $i = 1, \dots, m-1$,
and, of course, 

m m 

i =1 i —1 

After this identification, recall that $H$ is the maximal number of $\alpha_m$'s after, in that order, the $k_1$ $\alpha_1$'s, $k_2$ $\alpha_2$'s, $\dots$, $k_{m-1}$ $\alpha_{m-1}$'s. Counting the $\alpha_m$'s present between the various locations of the $\alpha_i$'s, $i = 1, \dots, m-1$, and after another moment of reflection, it is clear that

$$H = N_m - R,$$

where

$$R = \sum_{i=1}^{m-1} \sum_{j=N_i^*+1}^{N_i^*+k_i} N_m^{T_i^{j-1}, T_i^j}, \qquad (2.9)$$


and where the $N_i^*$ are given by (2.8). Recall also that, according to Remark 2.1, $R$ actually depends on $k_1, \dots, k_{m-1}$ but that, for the sake of readability, this dependency is omitted from our notations. Summarizing our results so far leads to:


Theorem 2.1 Let $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ be two sequences whose coordinates take their values in $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$, a totally ordered finite alphabet of cardinality $m$. Let $LCI_n$ be the length of the longest common and increasing subsequences of $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$. Then,

$$LCI_n = \max \min\left( \sum_{i=1}^{m-1} k_i + N_m(X) - R(X),\ \sum_{i=1}^{m-1} k_i + N_m(Y) - R(Y) \right), \qquad (2.10)$$

where the outer maximum is taken over $(k_1, \dots, k_{m-1})$ in

$$C_n = \left\{ (k_1, \dots, k_{m-1}) : k_1 \in C_{n,1},\ k_2 \in C_{n,2}(k_1),\ k_3 \in C_{n,3}(k_1, k_2),\ \dots,\ k_{m-1} \in C_{n,m-1}(k_1, \dots, k_{m-2}) \right\}, \qquad (2.11)$$

where $C_{n,1} = \left\{ 0 \le k_1 \le \left( N_1(X) - N_1^*(X) \right) \wedge \left( N_1(Y) - N_1^*(Y) \right) \right\}$ and for $i = 2, \dots, m-1$,

$$C_{n,i}(k_1, \dots, k_{i-1}) = \left\{ 0 \le k_i \le \left( N_i(X) - N_i^*(X) \right) \wedge \left( N_i(Y) - N_i^*(Y) \right) \right\}, \qquad (2.12)$$

and where

$$R = \sum_{i=1}^{m-1} \sum_{j=N_i^*+1}^{N_i^*+k_i} N_m^{T_i^{j-1}, T_i^j},$$

with the various $N$'s and $T$'s given above by (2.1), (2.2), (2.3) and (2.8).

The representation (2.10) has the great advantage of (essentially) only involving the quantities $N_i$, $N_i^*$, $i = 1, 2, \dots, m-1$, and $T_i^j$, $i = 1, 2, \dots, m-1$, $j = 1, 2, \dots$, and $N_m$.


3 Probability 

Let us now bring our probabilistic framework into the picture by first studying the random variables $N_m^{T_i^{j-1}, T_i^j}$, $i = 1, 2, \dots, m-1$ and $j = 1, 2, \dots$, and then the random variables $N_i^*$, $i = 1, 2, \dots, m-1$, appearing in $R$ in (2.9).

Proposition 3.1 Let $(Z_n)_{n \ge 1}$ be a sequence of iid random variables with $P(Z_1 = \alpha_i) = p_i$, $i = 1, \dots, m$. For each $i = 1, 2, \dots, m$, let $T_i^0 = 0$, and let $T_i^j$, $j = 1, 2, \dots$, be the location of the $j$-th $\alpha_i$ in the infinite sequence $(Z_n)_{n \ge 1}$. Let $i, r \in \{1, \dots, m\}$, with $r \ne i$. Then, for any $j = 1, 2, \dots$, the conditional law of $N_r^{T_i^{j-1}, T_i^j}$ given $(T_i^{j-1}, T_i^j)$ is binomial with parameters $T_i^j - T_i^{j-1} - 1$ and $p_r/(1 - p_i)$, which we denote by $\mathcal{B}(T_i^j - T_i^{j-1} - 1, p_r/(1 - p_i))$. Moreover, the conditional law of $(N_r^{T_i^{j-1}, T_i^j})_{r = 1, \dots, m, r \ne i}$ given $(T_i^{j-1}, T_i^j)$ is multinomial with parameters $T_i^j - T_i^{j-1} - 1$ and $(p_r/(1 - p_i))_{r = 1, \dots, m, r \ne i}$, which we denote by $\mathcal{M}ul(T_i^j - T_i^{j-1} - 1, (p_r/(1 - p_i))_{r = 1, \dots, m, r \ne i})$. Finally, for each $i \ne r$, the random variables $(N_r^{T_i^{j-1}, T_i^j})_{j \ge 1}$ are independent with mean $p_r/p_i$ and variance $(p_r/p_i)(1 + p_r/p_i)$, and, moreover, they are identically distributed in case the $(Z_n)_{n \ge 1}$ are uniformly distributed.


Proof. Let us denote by $\mathcal{L}(N_r^{T_i^{j-1}, T_i^j} \mid T_i^{j-1}, T_i^j)$ the conditional law of $N_r^{T_i^{j-1}, T_i^j}$ given $T_i^{j-1}, T_i^j$. Recall, see (2.3), that $T_i^{j-1}$ and $T_i^j$ are the respective locations of the $(j-1)$-th and the $j$-th $\alpha_i$ in the infinite sequence $(Z_n)_{n \ge 1}$. Thus between $T_i^{j-1} + 1$ and $T_i^j$, there are $T_i^j - T_i^{j-1} - 1$ free spots and each one is equally likely to contain $\alpha_r$, $r \ne i$, with probability $p_r / \sum_{\ell \ne i} p_\ell = p_r/(1 - p_i)$. Therefore,

$$\mathcal{L}\left( N_r^{T_i^{j-1}, T_i^j} \mid T_i^{j-1}, T_i^j \right) = \mathcal{B}\left( T_i^j - T_i^{j-1} - 1,\ \frac{p_r}{1 - p_i} \right). \qquad (3.1)$$

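Let us now compute the probability generating function of the random variables $N_r^{T_i^{j-1}, T_i^j}$. First, via (3.1),

$$E\left[ x^{N_r^{T_i^{j-1}, T_i^j}} \right] = E\left[ E\left[ x^{N_r^{T_i^{j-1}, T_i^j}} \mid T_i^{j-1}, T_i^j \right] \right] = E\left[ \left( 1 - \frac{p_r}{1 - p_i} + \frac{p_r}{1 - p_i}\, x \right)^{T_i^j - T_i^{j-1} - 1} \right] = \frac{p_i}{p_i + p_r - p_r x}, \qquad (3.2)$$

since $T_i^j$ is a negative binomial (Pascal) random variable with parameters $j$ and $p_i$, which we shall denote $\mathcal{BN}(j, p_i)$ in the sequel, and $T_i^j - T_i^{j-1}$ is a geometric random variable with parameter $p_i$, which we shall denote $\mathcal{G}(p_i)$. Therefore,

$$E\left[ N_r^{T_i^{j-1}, T_i^j} \right] = \frac{p_r}{p_i} \quad \text{and} \quad \mathrm{Var}\left( N_r^{T_i^{j-1}, T_i^j} \right) = \frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right). \qquad (3.3)$$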

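In the uniform case, i.e., $p_i = 1/m$, $i = 1, \dots, m$, the $N_r^{T_i^{j-1}, T_i^j}$, $i = 1, \dots, m$, $i \ne r$, $j = 1, 2, \dots$, are clearly seen to be identically distributed, via (3.2). The multinomial part of the statement is proved in a very similar manner. The $T_i^j - T_i^{j-1} - 1$ free spots are to contain the letters $\alpha_r$, $r \in \{1, \dots, m\}$, $r \ne i$, with respective probabilities $p_r/(1 - p_i)$. Therefore,

$$\mathcal{L}\left( \left( N_r^{T_i^{j-1}, T_i^j} \right)_{r = 1, \dots, m, r \ne i} \mid T_i^{j-1}, T_i^j \right) = \mathcal{M}ul\left( T_i^j - T_i^{j-1} - 1,\ \left( \frac{p_r}{1 - p_i} \right)_{r = 1, \dots, m, r \ne i} \right). \qquad (3.4)$$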


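Via (3.4), the probability generating function of the random vector $(N_r^{T_i^{j-1}, T_i^j})_{r = 1, \dots, m, r \ne i}$ is then given by:

$$E\left[ \prod_{r=1, r \ne i}^{m} x_r^{N_r^{T_i^{j-1}, T_i^j}} \right] = E\left[ E\left[ \prod_{r=1, r \ne i}^{m} x_r^{N_r^{T_i^{j-1}, T_i^j}} \mid T_i^{j-1}, T_i^j \right] \right] = \sum_{\ell=1}^{\infty} p_i (1 - p_i)^{\ell - 1} \left( \sum_{r=1, r \ne i}^{m} \frac{p_r}{1 - p_i}\, x_r \right)^{\ell - 1} = \frac{p_i}{1 - \sum_{r=1, r \ne i}^{m} p_r x_r}. \qquad (3.5)$$

As a direct consequence of (3.5), and for $r \ne s$ with $r \ne i$, $s \ne i$,

$$\mathrm{Cov}\left( N_r^{T_i^{j-1}, T_i^j},\ N_s^{T_i^{j-1}, T_i^j} \right) = \frac{p_r p_s}{p_i^2}.$$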

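The proof of the proposition will be complete once, for each $i \ne r$, the random variables $N_r^{T_i^{j-1}, T_i^j}$, $j \ge 1$, are shown to be independent. First, note that for $j \ne k$ and given $T_i^{j-1}, T_i^j, T_i^{k-1}, T_i^k$, the random variables

$$N_r^{T_i^{j-1}, T_i^j} = \sum_{\ell = T_i^{j-1}+1}^{T_i^j} \mathbf{1}_{\{Z_\ell = \alpha_r\}} \quad \text{and} \quad N_r^{T_i^{k-1}, T_i^k} = \sum_{\ell = T_i^{k-1}+1}^{T_i^k} \mathbf{1}_{\{Z_\ell = \alpha_r\}}$$

are independent since the intervals $[T_i^{j-1} + 1, T_i^j]$ and $[T_i^{k-1} + 1, T_i^k]$ are disjoint, and since the $(Z_\ell)_{\ell \ge 1}$ are also independent. Moreover, recall that the conditional distributions are given by (3.1), and so, for instance,

$$\mathcal{L}\left( N_r^{T_i^{j-1}, T_i^j} \mid T_i^{j-1}, T_i^j, T_i^{k-1}, T_i^k \right) = \mathcal{L}\left( N_r^{T_i^{j-1}, T_i^j} \mid T_i^{j-1}, T_i^j \right) = \mathcal{B}\left( T_i^j - T_i^{j-1} - 1,\ \frac{p_r}{1 - p_i} \right).$$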
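Therefore, for any measurable functions $f, g : \mathbb{R}_+ \to \mathbb{R}_+$, and if $E_{\mathcal{B}(n,p)}$ denotes the expectation with respect to a binomial $\mathcal{B}(n, p)$ distribution, then, writing $q = p_r/(1 - p_i)$,

$$E\left[ f\left( N_r^{T_i^{j-1}, T_i^j} \right) g\left( N_r^{T_i^{k-1}, T_i^k} \right) \right] = E\left[ E\left[ f\left( N_r^{T_i^{j-1}, T_i^j} \right) g\left( N_r^{T_i^{k-1}, T_i^k} \right) \mid T_i^{j-1}, T_i^j, T_i^{k-1}, T_i^k \right] \right]$$

$$= E\left[ E\left[ f\left( N_r^{T_i^{j-1}, T_i^j} \right) \mid T_i^{j-1}, T_i^j, T_i^{k-1}, T_i^k \right] E\left[ g\left( N_r^{T_i^{k-1}, T_i^k} \right) \mid T_i^{j-1}, T_i^j, T_i^{k-1}, T_i^k \right] \right] \qquad (3.6)$$

$$= E\left[ E_{\mathcal{B}(T_i^j - T_i^{j-1} - 1,\, q)}[f]\ E_{\mathcal{B}(T_i^k - T_i^{k-1} - 1,\, q)}[g] \right]$$

$$= E\left[ E_{\mathcal{B}(T_i^j - T_i^{j-1} - 1,\, q)}[f] \right] E\left[ E_{\mathcal{B}(T_i^k - T_i^{k-1} - 1,\, q)}[g] \right] \qquad (3.7)$$

$$= E\left[ f\left( N_r^{T_i^{j-1}, T_i^j} \right) \right] E\left[ g\left( N_r^{T_i^{k-1}, T_i^k} \right) \right],$$

where the equality in (3.6) is due to the conditional independence property, while the one in (3.7) follows from the fact that

$$E_{\mathcal{B}(T_i^j - T_i^{j-1} - 1,\, q)}[f] = F\left( T_i^j - T_i^{j-1} \right) \quad \text{and} \quad E_{\mathcal{B}(T_i^k - T_i^{k-1} - 1,\, q)}[g] = G\left( T_i^k - T_i^{k-1} \right),$$

for some functions $F, G$, and from the independence of $T_i^j - T_i^{j-1}$ and $T_i^k - T_i^{k-1}$. The argument can then be easily adapted to justify the mutual independence of the random variables $(N_r^{T_i^{j-1}, T_i^j})_{j \ge 1}$. □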

With the help of the previous proposition, and in order to prepare our first fluctuation result, it is relevant to rewrite the representation (2.10) as

$$LCI_n = \max_{C_n} \min\left( \sum_{i=1}^{m-1} k_i + N_m(X) - G_{n,m}(X),\ \sum_{i=1}^{m-1} k_i + N_m(Y) - G_{n,m}(Y) \right), \qquad (3.8)$$

where

$$G_{n,m} = \sum_{i=1}^{m-1} \sum_{j=N_i^*+1}^{N_i^*+k_i} N_m^{T_i^{j-1}, T_i^j}, \qquad (3.9)$$

and where $p_i(X) = P(X_1 = \alpha_i)$ and $p_i(Y) = P(Y_1 = \alpha_i)$, $1 \le i \le m$. Recall once more that $G_{n,m}$ actually depends on $k_1, \dots, k_{m-1}$ but that, for the sake of readability, this dependency is omitted from our notations, see Remark 2.1.


Via (3.8) and (3.9), $LCI_n$ is now represented as a max/min over random constraints of random sums of randomly stopped independent random variables, except for the presence of $N_m(X)$ and $N_m(Y)$. Our next result also represents, up to a small error term, both $N_m(X)$ and $N_m(Y)$ via the same random variables.


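Proposition 3.2 For each $i = 1, 2, \dots, m$, and $r \ne i$,

$$N_r = \frac{p_r}{p_i} N_i + \sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}\ \sum_{j=1}^{N_i} \frac{N_r^{T_i^{j-1}, T_i^j} - \frac{p_r}{p_i}}{\sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}} + S_{i,r}^{(n)}, \qquad (3.10)$$

where $\lim_{n \to +\infty} S_{i,r}^{(n)}/\sqrt{n} = 0$, in probability. In particular, for each $r = 1, 2, \dots, m$,

$$N_r = n p_r + \sum_{i=1, i \ne r}^{m} \sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}\ p_i \sum_{j=1}^{N_i} \frac{N_r^{T_i^{j-1}, T_i^j} - \frac{p_r}{p_i}}{\sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}} + \sum_{i=1, i \ne r}^{m} p_i S_{i,r}^{(n)}. \qquad (3.11)$$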

Proof. Let us start the proof of (3.10) by identifying the random variables $S_{i,r}^{(n)}$ and showing that, when scaled by $\sqrt{n}$, they converge in probability to zero. Clearly, for $i = 1, \dots, m$, $i \ne r$,

$$0 \le S_{i,r}^{(n)} = N_r - \sum_{j=1}^{N_i} N_r^{T_i^{j-1}, T_i^j}.$$

In other words, $S_{i,r}^{(n)}$ is the number of $\alpha_r$'s in the interval $[T_i^* + 1, n]$, where $T_i^*$ is the location of the last $\alpha_i$ in $[1, n]$. Therefore,

$$0 \le S_{i,r}^{(n)} \le n - T_i^* = n - \left( T_i^{N_i} \wedge n \right). \qquad (3.12)$$
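
The decomposition of $N_r$ along the successive occurrences of $\alpha_i$, together with the bound (3.12), can be checked on any finite word. The sketch below is illustrative: letters are assumed to be integers and the function name `gap_decomposition` is ours.

```python
def gap_decomposition(word, i, r):
    """Split the count of letters r along the successive letters i.

    Returns (gaps, remainder): gaps[j-1] is the number of letters r located
    between the (j-1)-th and the j-th letter i, and remainder is the number
    of letters r after the last letter i (the term S in the text).
    """
    gaps, current = [], 0
    for c in word:
        if c == i:
            gaps.append(current)  # close the gap ending at this letter i
            current = 0
        elif c == r:
            current += 1
    return gaps, current
```

In the word 2, 1, 2, 2, 1, 3, 2, with $i = 1$ and $r = 2$, the gaps hold one and two letters 2, and one more letter 2 follows the last letter 1, located at position 5, so that the remainder is indeed at most $7 - 5 = 2$.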


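But, $P(T_i^* = n - k) = p_i (1 - p_i)^k$, $k = 0, 1, \dots, n-1$, and $P(T_i^* = 0) = (1 - p_i)^n$. Hence, for all $\varepsilon > 0$, and $n$ large enough,

$$P\left( \frac{S_{i,r}^{(n)}}{\sqrt{n}} \ge \varepsilon \right) \le P\left( n - T_i^* \ge \varepsilon \sqrt{n} \right) \le \sum_{\ell \ge [\varepsilon \sqrt{n}]} p_i (1 - p_i)^{\ell} \le (1 - p_i)^{[\varepsilon \sqrt{n}]} \underset{n \to +\infty}{\longrightarrow} 0. \qquad (3.13)$$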


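Let us continue with the proof of (3.11). Multiplying both sides of (3.10) by $p_i$ and summing over $i = 1, \dots, m$, $i \ne r$, we get

$$\sum_{i=1, i \ne r}^{m} p_i N_r = p_r \sum_{i=1, i \ne r}^{m} N_i + \sum_{i=1, i \ne r}^{m} p_i \sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}\ \sum_{j=1}^{N_i} \frac{N_r^{T_i^{j-1}, T_i^j} - \frac{p_r}{p_i}}{\sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}} + \sum_{i=1, i \ne r}^{m} p_i S_{i,r}^{(n)}. \qquad (3.14)$$

But $\sum_{i=1}^{m} N_i = n$ and $\sum_{i \ne r} p_i = 1 - p_r$, and so (3.14) becomes

$$N_r = n p_r + \sum_{i=1, i \ne r}^{m} \sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}\ p_i \sum_{j=1}^{N_i} \frac{N_r^{T_i^{j-1}, T_i^j} - \frac{p_r}{p_i}}{\sqrt{\frac{p_r}{p_i}\left( 1 + \frac{p_r}{p_i} \right) n}} + \sum_{i=1, i \ne r}^{m} p_i S_{i,r}^{(n)},$$

which is precisely (3.11). □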


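Remark 3.1 For all $i \ne r$, $\lim_{n \to +\infty} E[(S_{i,r}^{(n)})^2 / n] = 0$. Indeed,

$$E\left[ \frac{(S_{i,r}^{(n)})^2}{n} \right] = \int_0^{+\infty} P\left( (S_{i,r}^{(n)})^2 \ge x n \right) dx \le \int_0^{+\infty} (1 - p_i)^{[\sqrt{xn}]}\, dx \le \int_0^{+\infty} (1 - p_i)^{\sqrt{xn} - 1}\, dx = \frac{2}{n (1 - p_i) (\ln(1 - p_i))^2}.$$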


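Returning to the representation (3.8), the previous proposition allows us to rewrite $LCI_n$ as:

$$LCI_n = \max_{C_n} \min\left( n p_m(X) + \sum_{i=1}^{m-1} k_i - p_m(X) \sum_{i=1}^{m-1} \frac{k_i}{p_i(X)} + H_{m,n}(X) + \sum_{i=1}^{m-1} p_i(X) S_{i,m}^{(n)}(X),\ n p_m(Y) + \sum_{i=1}^{m-1} k_i - p_m(Y) \sum_{i=1}^{m-1} \frac{k_i}{p_i(Y)} + H_{m,n}(Y) + \sum_{i=1}^{m-1} p_i(Y) S_{i,m}^{(n)}(Y) \right), \qquad (3.15)$$

where, omitting the dependency in $k_1, \dots, k_{m-1}$ (see Remark 2.1),

$$H_{m,n} = \sum_{i=1}^{m-1} p_i \sqrt{\frac{p_m}{p_i}\left( 1 + \frac{p_m}{p_i} \right) n}\ \sum_{j=1}^{N_i} \frac{N_m^{T_i^{j-1}, T_i^j} - \frac{p_m}{p_i}}{\sqrt{\frac{p_m}{p_i}\left( 1 + \frac{p_m}{p_i} \right) n}} - \sum_{i=1}^{m-1} \sqrt{\frac{p_m}{p_i}\left( 1 + \frac{p_m}{p_i} \right) n}\ \sum_{j=N_i^*+1}^{N_i^*+k_i} \frac{N_m^{T_i^{j-1}, T_i^j} - \frac{p_m}{p_i}}{\sqrt{\frac{p_m}{p_i}\left( 1 + \frac{p_m}{p_i} \right) n}}. \qquad (3.16)$$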


We now study some of the properties of the random variables $N_i^*$, which are present in both the random constraints and the random sums. The random variables $N_i^*$ are defined recursively by (2.8) with $N_1^* = 0$. We fix $k = (k_1, \dots, k_{m-1})$, where $k_i$ is the number of letters $\alpha_i$ present in the common increasing subsequences. The random variables $N_i^*$, $i \ge 2$, depend on $k$; actually $N_i^* = N_i^*(k_1, \dots, k_{i-1})$. We write

$$N_i^* = \sum_{j=1}^{i-1} N_{i,j}^*, \qquad (3.17)$$

where $N_{i,j}^* = N_{i,j}^*(k_j)$ is the number of letters $\alpha_i$ present in the step $j \le i-1$ consisting in collecting the $k_j$ letters $\alpha_j$. (In the sequel, in order not to further burden the notations, we shall skip the symbols $k_j$, $j = 1, \dots, i-1$, in $N_i^*$ and $N_{i,j}^*$.) The following diagram encapsulates the drawing of the letters:


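Step 1, up to location $T_1^{k_1}$: $k_1$ letters $\alpha_1$, $N_{2,1}^*$ letters $\alpha_2$, $N_{3,1}^*$ letters $\alpha_3$, $\dots$, $N_{i,1}^*$ letters $\alpha_i$.

Step 2, from $T_1^{k_1}$ to $T_2^{k_2 + N_2^*}$: $k_2$ letters $\alpha_2$, $N_{3,2}^*$ letters $\alpha_3$, $\dots$

Step 3, from $T_2^{k_2 + N_2^*}$ to $T_3^{k_3 + N_3^*}$: $k_3$ letters $\alpha_3$, $\dots$

$\vdots$

Step $i-1$, from $T_{i-2}^{k_{i-2} + N_{i-2}^*}$ to $T_{i-1}^{k_{i-1} + N_{i-1}^*}$: $k_{i-1}$ letters $\alpha_{i-1}$, $N_{i,i-1}^*$ letters $\alpha_i$.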


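In Step $j \le i-1$, there are $T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*}$ letters selected but $k_j$ letters are $\alpha_j$, $N_{j+1,j}^*$ are $\alpha_{j+1}$, $\dots$, $N_{i-1,j}^*$ are $\alpha_{i-1}$ (for $j = i-1$, there are also $k_j$ letters $\alpha_j$ but none of the others $\alpha_{j+1}$, etc.).

Moreover, there are $T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*} - k_j - N_{j+1,j}^* - \cdots - N_{i-1,j}^*$ possible spots ($T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*} - k_j$ in case $j = i-1$) in which the probability of having an $\alpha_i$ is $p_{i,j} = p_i/(1 - p_j - \cdots - p_{i-1})$. Therefore, conditionally on

$$\mathcal{G}_{i,j}(k) = \sigma\left( N_{j+1,j}^*, \dots, N_{i-1,j}^*,\ T_{j-1}^{k_{j-1} + N_{j-1}^*},\ T_j^{k_j + N_j^*} \right)$$

(the $\sigma$-field generated by $N_{j+1,j}^*, \dots, N_{i-1,j}^*$, $T_{j-1}^{k_{j-1} + N_{j-1}^*}$, $T_j^{k_j + N_j^*}$), it follows that

$$N_{i,j}^* \sim \mathcal{B}\left( T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*} - k_j - N_{j+1,j}^* - \cdots - N_{i-1,j}^*,\ p_{i,j} \right). \qquad (3.18)$$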


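The two forthcoming propositions respectively characterize the laws of $N_{i,j}^*$ and of $N_i^*$.


Proposition 3.3 For each $i = 2, \dots, m$, the probability generating function of $N_{i,j}^*$, $1 \le j \le i-1$, is given by

$$E\left[ x^{N_{i,j}^*} \right] = \left( \frac{p_j}{p_j + p_i - p_i x} \right)^{k_j}. \qquad (3.19)$$

Therefore, $N_{i,j}^*$ is distributed as $\sum_{\ell=1}^{k_j} (G_\ell - 1)$, where $(G_\ell)_{1 \le \ell \le k_j}$ are independent with geometric law $\mathcal{G}(p_j/(p_j + p_i))$ and so,

$$E\left[ N_{i,j}^* \right] = \frac{p_i}{p_j}\, k_j \quad \text{and} \quad \mathrm{Var}\left( N_{i,j}^* \right) = \left( 1 + \frac{p_i}{p_j} \right) \frac{p_i}{p_j}\, k_j. \qquad (3.20)$$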


Proof. Recall that, for $N \sim \mathcal{B}(n, p)$, $E[x^N] = (1 - p + px)^n$ while, for $N \sim \mathcal{G}(p)$, $E[x^N] = px/(1 - (1 - p)x)$. Using (3.18), we then have, for $N = T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*} - k_j - N_{j+1,j}^* - \cdots - N_{i-1,j}^*$:

$$E\left[ x^{N_{i,j}^*} \right] = E\left[ E\left[ x^{N_{i,j}^*} \mid N \right] \right] = E\left[ (1 - p_{i,j} + p_{i,j} x)^{N} \right] = E\left[ y^{U - V} \right], \qquad (3.21)$$

setting $y = 1 - p_{i,j} + p_{i,j} x$, and

$$U := T_j^{k_j + N_j^*} - T_{j-1}^{k_{j-1} + N_{j-1}^*} - k_j \sim \mathcal{BN}(k_j, p_j) * \delta_{-k_j}, \qquad (3.22)$$

$$V := \sum_{r=j+1}^{i-1} N_{r,j}^* \sim \mathcal{B}\left( U,\ \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right) \text{ (conditionally on } U\text{)}, \qquad (3.23)$$

where for $j = i-1$, we also set $V = 0$. The notation $\mathcal{BN}(k, p)$ above stands for the negative binomial (Pascal) distribution with parameters $k$ and $p$. The parameters of the binomial random variable $V$ in (3.23) stem from the fact that $V$ counts the number of letters $\alpha_r$, $j+1 \le r \le i-1$, between two letters $\alpha_j$, while exactly $k_j$ such letters are obtained, so that each $\alpha_r$ has probability $p_r/(1 - p_j)$ to appear. Hence,


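$$E\left[ y^{U - V} \right] = E\left[ E\left[ y^{U - V} \mid U \right] \right] = E\left[ y^U E\left[ y^{-V} \mid U \right] \right] = E\left[ y^U \left( 1 - \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} + \sum_{r=j+1}^{i-1} \frac{p_r}{(1 - p_j) y} \right)^{U} \right]$$

$$= E\left[ \left( \left( 1 - \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right) y + \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right)^{U} \right] = \left( \frac{p_j}{1 - (1 - p_j)\left( \left( 1 - \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right) y + \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right)} \right)^{k_j},$$

since, from (3.22), $U \sim \sum_{\ell=1}^{k_j} (G_\ell - 1)$, where the $G_\ell$, $1 \le \ell \le k_j$, are iid with distribution $\mathcal{G}(p_j)$. Finally,

$$E\left[ y^{U - V} \right] = \left( \frac{p_j}{1 - (1 - p_j)\left( \left( 1 - \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right) y + \sum_{r=j+1}^{i-1} \frac{p_r}{1 - p_j} \right)} \right)^{k_j} = \left( \frac{p_j}{p_j + p_i - p_i x} \right)^{k_j},$$

since $p_{i,j} = p_i/(1 - \sum_{r=j}^{i-1} p_r)$. The expressions for the expectation and for the variance in (3.20) follow from straightforward computations. □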


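Recall that, by convention, $N_1^* = 0$, and for $2 \le i \le m$, the following proposition gives the law of $N_i^*$:


Proposition 3.4 For each $i = 2, \dots, m$, the random variables $(N_{i,j}^*)_{1 \le j \le i-1}$ are independent. Hence, the probability generating function of $N_i^*$ is given by

$$E\left[ x^{N_i^*} \right] = \prod_{j=1}^{i-1} \left( \frac{p_j}{p_j + p_i - p_i x} \right)^{k_j}, \qquad (3.24)$$

and so,

$$E\left[ N_i^* \right] = \sum_{j=1}^{i-1} \frac{p_i}{p_j}\, k_j \quad \text{and} \quad \mathrm{Var}\left( N_i^* \right) = \sum_{j=1}^{i-1} \left( 1 + \frac{p_i}{p_j} \right) \frac{p_i}{p_j}\, k_j. \qquad (3.25)$$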


Proof. In view of Proposition 3.3 and of (3.17), it is enough to prove the first part of the proposition, i.e., to prove that the random variables $N_{i,j}^*$, $1 \le j \le i-1$, are independent. In order to simplify notations, we only show that $N_{i,1}^*$ and $N_{i,2}^*$ are independent, but the argument can easily be extended to prove the full independence property. Since the $T_i^j$'s are stopping times, by the strong Markov property, observe that $\sigma(X_1, \dots, X_{T_1^{k_1}})$ and $\sigma(X_{T_1^{k_1}+1}, \dots, X_{T_2^{k_2 + N_2^*}})$ are independent conditionally on $T_1^{k_1}$, where, again, $\sigma(X_1, \dots, X_n)$ denotes the $\sigma$-field generated by the random variables $X_1, \dots, X_n$. Moreover, $T_1^{k_1}$ and $\sigma(X_{T_1^{k_1}+1}, \dots, X_{T_2^{k_2 + N_2^*}})$ are independent, and thus so are $\sigma(X_1, \dots, X_{T_1^{k_1}})$ and $\sigma(X_{T_1^{k_1}+1}, \dots, X_{T_2^{k_2 + N_2^*}})$. The independence of $N_{i,1}^*$ and $N_{i,2}^*$ becomes clear, since $N_{i,1}^*$ is $\sigma(X_1, \dots, X_{T_1^{k_1}})$-measurable while $N_{i,2}^*$ is $\sigma(X_{T_1^{k_1}+1}, \dots, X_{T_2^{k_2 + N_2^*}})$-measurable. The whole conclusion of the proposition then follows. □


4 The Uniform Case 


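In this section, we specialize our results to the case where the letters are uniformly drawn from the alphabet, i.e., $p_i(X) = p_i(Y) = 1/m$, for all $1 \le i \le m$. Hence, the functional $LCI_n$ in (3.15) rewrites as

$$LCI_n = \max_{C_n} \min\left( \frac{n}{m} + H_{m,n}(X) + \frac{1}{m} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(X),\ \frac{n}{m} + H_{m,n}(Y) + \frac{1}{m} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(Y) \right), \qquad (4.1)$$

and therefore

$$\frac{LCI_n - n/m}{\sqrt{2n}} = \max_{C_n} \min\left( \frac{H_{m,n}(X)}{\sqrt{2n}} + \frac{1}{m\sqrt{2n}} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(X),\ \frac{H_{m,n}(Y)}{\sqrt{2n}} + \frac{1}{m\sqrt{2n}} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(Y) \right). \qquad (4.2)$$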

The following simple inequality, a version of which is already present in [HLM], will be of multiple use (see Appendix A.1 for a proof):


Lemma 4.1 Let $a_k, b_k, c_k, d_k$, $1 \le k \le K$, be reals. Then,

$$\left| \max_{k=1,\dots,K} (a_k \wedge b_k) - \max_{k=1,\dots,K} \left( (a_k + c_k) \wedge (b_k + d_k) \right) \right| \le \max_{k=1,\dots,K} \left( |c_k| \vee |d_k| \right). \qquad (4.3)$$

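The previous lemma entails

$$\left| \max_{C_n} \min\left( \frac{H_{m,n}(X)}{\sqrt{2n}} + \frac{1}{m\sqrt{2n}} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(X),\ \frac{H_{m,n}(Y)}{\sqrt{2n}} + \frac{1}{m\sqrt{2n}} \sum_{i=1}^{m-1} S_{i,m}^{(n)}(Y) \right) - \max_{C_n} \min\left( \frac{H_{m,n}(X)}{\sqrt{2n}},\ \frac{H_{m,n}(Y)}{\sqrt{2n}} \right) \right| \le \frac{1}{m\sqrt{2n}} \left( \left| \sum_{i=1}^{m-1} S_{i,m}^{(n)}(X) \right| \vee \left| \sum_{i=1}^{m-1} S_{i,m}^{(n)}(Y) \right| \right).$$

But, from Proposition 3.2, as $n \to +\infty$, both $S_{i,m}^{(n)}(X)/\sqrt{n} \to 0$ and $S_{i,m}^{(n)}(Y)/\sqrt{n} \to 0$ in probability, for all $1 \le i \le m-1$ (see (3.13)). Therefore, the fluctuations of $LCI_n$ expressed in (4.2) are the same as those of

$$\max_{C_n} \min\left( \frac{H_{m,n}(X)}{\sqrt{2n}},\ \frac{H_{m,n}(Y)}{\sqrt{2n}} \right).$$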
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For uniform draws, the functional H mn in (3.16) rewrites as 


m— 1 


Hn 


1 


Ni 


IV, 




m— 1 


N*+ki rpj-l rpj 

• Ni ,T ‘ -1 


yv2ii-Y — w J,m 

tr “tr ^ tr 

^(rE4"f?)-Ek 


2—1 


2—1 


j=N*+l 

N* + k 7 


B® ( — 
V n 


where is the Brownian approximation dehned from the random variables N, 


Tt\Ti 


j > 1, which are iid, by Proposition 3.1, centered and scaled to have variance one, i.e., B, 

(4.4) 


is the polygonal process on [0,1] defined by linear interpolation between the values 

Z (i 


3 = 1 


where 


'JiJ ~ 1 rpj 

/ /v * * < — 1 

„P) _ JV m 1 


V2 ' 

Next, we present some heuristic arguments which provide the limiting behavior of 

m— 1 / ~\ t / m—1 


(4.5) 


1 

max mm | — 

i m 


(^) - v (4- (^±F) - (MN)) 


2=1 
m—1 


m 


2 > 

2=1 




n 

NiiY) 


n 


2=1 
m—1 


-ED 


2=1 


B.yf'WI + t'l .jayi® 


n 


n 


46) 


knowing that, by Donsker theorem, (F?^,..., F?n m_1 ^) (Fh 1 ),..., F?*” 1-1 )), n —* 

+oo, where {B^\ ..., is a drift-less, (m — l)-dimensional, correlated Brownian mo¬ 

tion on [0,1], which is also zero at the origin. The correlation structure of this multivariate 
Brownian motion is given by that of the Z^\ 1 < i < m — 1, which in turn is given by 

Proposition 3.1. Above, » stands for the convergence in law in the product space 

of continuous function on [0,1] vanishing at the origin. Since the multivariate Donsker 
theorem is crucial is our argument, we give a precise statement: 

Theorem 4.1 (Donsker) Let $(Z_j)_{j \ge 1}$ be iid square integrable centered random vectors in
$\mathbb{R}^{m-1}$, $m \ge 2$, with covariance matrix $\Sigma$. Let $(B_n(t))_{t \in [0,1]}$ be the polygonal process defined,
for each $n \ge 1$, by

$$B_n(t) = \frac{1}{\sqrt{n}}\sum_{k=1}^{[nt]} Z_k + \frac{nt - [nt]}{\sqrt{n}}\, Z_{[nt]+1}, \qquad t \in [0,1].$$

Then $B_n \Longrightarrow B$, where $B$ is an $(m-1)$-dimensional Brownian motion on $[0,1]$ with covariance matrix
$t\Sigma$, and where $\Longrightarrow$ stands for the convergence in law in the product space $(C_0([0,1]))^{m-1}$ of
continuous functions on $[0,1]$ vanishing at the origin.

Proof. The multivariate Donsker theorem easily derives from the classical univariate
one, for which we refer, for instance, to [Bil, Th. 8.2], and from the multivariate CLT as
follows. Recall that the convergence $B_n \Longrightarrow B$ in $(C_0([0,1]))^{m-1}$ is equivalent to the convergence
of the finite-dimensional distributions of $B_n$ to those of $B$ and to the tightness of $(B_n)_{n \ge 1}$ in
$(C_0([0,1]))^{m-1}$. First, the multivariate CLT gives the convergence of the finite-dimensional
distributions of $(B_n^{(1)}(t),\dots,B_n^{(m-1)}(t))_{0 \le t \le 1}$ with a covariance structure given by that of
the $Z_1^{(i)}$, $1 \le i \le m-1$. Second, the tightness of $(B_n^{(1)}(t),\dots,B_n^{(m-1)}(t))_{0 \le t \le 1}$ is obtained
from that of its coordinates: since $B_n^{(i)}$ is tight for each $1 \le i \le m-1$ by the univariate
Donsker theorem, for all $\varepsilon > 0$, there is a compact $K_i$ of $C_0([0,1])$, the usual space of
continuous functions on $[0,1]$ vanishing at the origin, such that $\sup_{n \ge 1} \mathbb{P}(B_n^{(i)} \notin K_i) < \varepsilon$
and we have

$$\sup_{n \ge 1} \mathbb{P}\big((B_n^{(1)},\dots,B_n^{(m-1)}) \notin K_1 \times \cdots \times K_{m-1}\big) \le \sup_{n \ge 1} \sum_{i=1}^{m-1} \mathbb{P}(B_n^{(i)} \notin K_i) \le (m-1)\varepsilon,$$

with $K_1 \times \cdots \times K_{m-1}$ compact in $(C_0([0,1]))^{m-1}$, so that $(B_n^{(1)},\dots,B_n^{(m-1)})$ is tight in
$(C_0([0,1]))^{m-1}$. □
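To make the polygonal process of Theorem 4.1 concrete, here is a minimal sketch for a single coordinate (scalar increments). The function name `polygonal_process` and the choice to treat $Z_{n+1}$ as irrelevant at $t = 1$ (its weight $nt - [nt]$ vanishes there) are illustrative assumptions, not notation from the paper.

```python
import math


def polygonal_process(z, t):
    """Evaluate B_n(t) for scalar increments z = [Z_1, ..., Z_n] and t in [0, 1].

    B_n(t) = (Z_1 + ... + Z_[nt]) / sqrt(n) + (nt - [nt]) * Z_([nt]+1) / sqrt(n).
    """
    n = len(z)
    if n == 0 or not 0.0 <= t <= 1.0:
        raise ValueError("need at least one increment and t in [0, 1]")
    k = min(int(math.floor(n * t)), n)   # [nt], the integer part of nt
    partial = sum(z[:k])                 # Z_1 + ... + Z_[nt]
    frac = n * t - k                     # nt - [nt]
    nxt = z[k] if k < n else 0.0         # Z_([nt]+1); its weight is 0 when t = 1
    return (partial + frac * nxt) / math.sqrt(n)
```

At grid points $t = k/n$ the function returns the normalized partial sum, and in between it interpolates linearly, which is the only structure of $B_n$ used in the sequel.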


Heuristics 

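Roughly speaking, there are three limits to handle in (4.6):

1. The limit of the constraints in the maximum over $C_n$;

2. The limit of the linear terms: $B_n^{(i),X}\big(N_i(X)/n\big)$;

3. The limit of the increments: $B_n^{(i),X}\big((N_i^*(X)+k_i)/n\big) - B_n^{(i),X}\big(N_i^*(X)/n\big)$;

and, similarly, for $X$ replaced by $Y$. Below, the symbol $\approx$ indicates a heuristic replacement
or a heuristic limit, as $n \to +\infty$.

First Limit (to be treated last, in Section 4.3): Since $C_{n,i}(k_1,\dots,k_{i-1}) = \{k = (k_1,\dots,k_{m-1}) :
0 \le k_i \le \min(N_i(X) - N_i^*(X),\ N_i(Y) - N_i^*(Y))\}$ (and, again, with vacuous constraints
in case either $N_i^*(X) > n$ or $N_i^*(Y) > n$), and from the concentration property of the $N_i^*$,
we expect (with again $k_0 = 0$, and $t_0 = 0$, below):

$$C_{n,i}(k_1,\dots,k_{i-1}) \approx \left\{ k = (k_1,\dots,k_{m-1}) : 0 \le k_i \le \Big(\mathbb{E}[N_i(X)] - \sum_{j=1}^{i-1} k_j\Big) \wedge \Big(\mathbb{E}[N_i(Y)] - \sum_{j=1}^{i-1} k_j\Big) \right\}.$$

Hence, for $C_n$ defined in (2.11):

$$\frac{1}{n}\, C_n \approx V(p_1,\dots,p_{m-1}),$$

where $V(p_1,\dots,p_{m-1}) = \{t = (t_1,\dots,t_{m-1}) : t_i \ge 0,\ i = 1,\dots,m-1,\ t_1 \le p_1,\ t_1 + t_2 \le p_2,\ \dots,\ t_1 + \cdots + t_{m-1} \le p_{m-1}\}$.

Second Limit (see Section 4.1): For each $i = 1,\dots,m-1$, the random variables $N_i$ are
concentrated around their respective mean $\mathbb{E}[N_i]$ ($= n/m$), and so

$$\frac{N_i}{n} \approx \frac{\mathbb{E}[N_i]}{n} = \frac{1}{m} \quad \text{and} \quad \sum_{i=1}^{m-1} B_n^{(i)}\Big(\frac{N_i}{n}\Big) \approx \sum_{i=1}^{m-1} B_n^{(i)}\Big(\frac{1}{m}\Big) \approx \sum_{i=1}^{m-1} B^{(i)}\Big(\frac{1}{m}\Big),$$

where the limit $B_n^{(i)} \Longrightarrow B^{(i)}$ is taken simultaneously.

Third Limit (see Section 4.2): For each $i = 1,\dots,m-1$, the random variables $N_i^*$ are also
concentrated around their mean $\mathbb{E}[N_i^*] = \sum_{j=1}^{i-1} k_j$, and so $N_i^* \approx \sum_{j=1}^{i-1} k_j$. Therefore, with $t_j = k_j/n$,

$$B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) \approx B^{(i),X}\Big(\sum_{j=1}^{i} t_j\Big) - B^{(i),X}\Big(\sum_{j=1}^{i-1} t_j\Big),$$

and similarly for $X$ replaced by $Y$. Hence,

$$\frac{\mathrm{LCI}_n - n/m}{\sqrt{2n}} \approx \max_{t \in V(1/m,\dots,1/m)} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),X}\Big(\frac{1}{m}\Big) - \sum_{i=1}^{m-1}\left( B^{(i),X}\Big(\sum_{j=1}^{i} t_j\Big) - B^{(i),X}\Big(\sum_{j=1}^{i-1} t_j\Big) \right),\right.$$

$$\left. \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),Y}\Big(\frac{1}{m}\Big) - \sum_{i=1}^{m-1}\left( B^{(i),Y}\Big(\sum_{j=1}^{i} t_j\Big) - B^{(i),Y}\Big(\sum_{j=1}^{i-1} t_j\Big) \right) \right)$$

$$\overset{\mathcal{L}}{=} \frac{1}{\sqrt{m}} \max_{0 = u_0 \le u_1 \le \cdots \le u_{m-1} \le 1} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),X}(1) - \sum_{i=1}^{m-1}\big( B^{(i),X}(u_i) - B^{(i),X}(u_{i-1}) \big),\right.$$

$$\left. \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),Y}(1) - \sum_{i=1}^{m-1}\big( B^{(i),Y}(u_i) - B^{(i),Y}(u_{i-1}) \big) \right),$$

by Brownian scaling and the reparametrization $\sum_{j=1}^{i} t_j = u_i/m$, $i = 1,\dots,m-1$, $u_0 = t_0 = 0$.
In other words,

$$\frac{\mathrm{LCI}_n - n/m}{\sqrt{2n/m}} \approx \max_{0 = u_0 \le u_1 \le \cdots \le u_{m-1} \le 1} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),X}(1) - \sum_{i=1}^{m-1}\big( B^{(i),X}(u_i) - B^{(i),X}(u_{i-1}) \big),\right.$$

$$\left. \frac{1}{m}\sum_{i=1}^{m-1} B^{(i),Y}(1) - \sum_{i=1}^{m-1}\big( B^{(i),Y}(u_i) - B^{(i),Y}(u_{i-1}) \big) \right).$$

Finally, a linear transformation and Brownian properties allow to transform the parameter
space into the Weyl chamber

$$W_m(1) := \{t = (t_0, t_1, \dots, t_m) : 0 = t_0 \le t_1 \le \cdots \le t_{m-1} \le t_m = 1\},$$

and to replace the $(m-1)$-dimensional correlated Brownian motion $B^X$ (resp. $B^Y$), by an
$m$-dimensional standard one $B_1$ (resp. $B_2$). Combining these facts, the expression on the
right-hand side above, becomes equal, in law, to:

$$\frac{1}{\sqrt{2}} \max_{t \in W_m(1)} \min\left( -\frac{1}{m}\sum_{i=1}^{m} B_1^{(i)}(1) + \sum_{i=1}^{m}\big( B_1^{(i)}(t_i) - B_1^{(i)}(t_{i-1}) \big),\ -\frac{1}{m}\sum_{i=1}^{m} B_2^{(i)}(1) + \sum_{i=1}^{m}\big( B_2^{(i)}(t_i) - B_2^{(i)}(t_{i-1}) \big) \right),$$

which is the final form of our result, Theorem 1.1. In the sequel, we make precise the
previous heuristic arguments.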
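To get a feel for the Brownian functional obtained at the end of the heuristics, the following minimal Monte Carlo sketch evaluates the max-min over the Weyl chamber in the simplest case $m = 2$, where the chamber reduces to a single free parameter $t_1 \in [0,1]$. Everything here is an illustrative assumption rather than the paper's method: the Brownian paths are replaced by Gaussian random walks on a grid, the maximum is taken over grid points only, and the names `brownian_path`, `component`, `limit_functional_m2` and `sample_limit` are hypothetical.

```python
import math
import random


def brownian_path(steps, rng):
    """Values of a discretized standard Brownian motion at times 0, 1/steps, ..., 1."""
    path = [0.0]
    sd = math.sqrt(1.0 / steps)
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path


def component(b1, b2, j):
    """For m = 2 and t_1 at grid index j:
    -(1/2)(b1(1) + b2(1)) + (b1(t_1) - b1(0)) + (b2(1) - b2(t_1))."""
    return -0.5 * (b1[-1] + b2[-1]) + b1[j] + (b2[-1] - b2[j])


def limit_functional_m2(paths_x, paths_y):
    """Max over grid values of t_1 of the minimum of the two components."""
    (x1, x2), (y1, y2) = paths_x, paths_y
    return max(min(component(x1, x2, j), component(y1, y2, j))
               for j in range(len(x1)))


def sample_limit(samples=100, steps=200, seed=1):
    """Draw independent approximate samples of the m = 2 functional."""
    rng = random.Random(seed)
    out = []
    for _ in range(samples):
        x = (brownian_path(steps, rng), brownian_path(steps, rng))
        y = (brownian_path(steps, rng), brownian_path(steps, rng))
        out.append(limit_functional_m2(x, y))
    return out
```

A histogram of `sample_limit()` gives a rough picture of the limiting law for a binary alphabet; refining `steps` reduces the discretization bias of the grid maximum.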


All along, we use different sets of constraints. For easy reference, we gather here the
references to these notations: C n is defined in (2.6), C n> j(/ci,..., i) in (2.7), C n in (2.11), 
in (2.12), C* t in (4.22), C* in (4.23 ),'c n ,i above (4.29), Cj in (4.29), 

in (4.59). 


4.1 The Linear Terms 


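Set

$$R(X) = \sum_{i=1}^{m-1}\left( B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) \right),$$

where again the dependency of $R(X)$ in $(k_1,\dots,k_{m-1})$ is omitted (see Remark 2.1), so that
with the help of (4.6), (4.2) rewrites as:

$$\frac{\mathrm{LCI}_n - n/m}{\sqrt{2n}} = \max_{C_n} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}\Big(\frac{N_i(X)}{n}\Big) - R(X),\ \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}\Big(\frac{N_i(Y)}{n}\Big) - R(Y) \right) + o_{\mathbb{P}}(1), \tag{4.7}$$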


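where, throughout, $o_{\mathbb{P}}(1)$ indicates a term, which might be different from an expression to
another, converging to zero, in probability, as $n$ converges to infinity.
Next, by Lemma 4.1,

$$\left| \max_{C_n} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}\Big(\frac{N_i(X)}{n}\Big) - R(X),\ \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}\Big(\frac{N_i(Y)}{n}\Big) - R(Y) \right) \right.$$

$$\left. - \max_{C_n} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}\Big(\frac{\mathbb{E}[N_i(X)]}{n}\Big) - R(X),\ \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}\Big(\frac{\mathbb{E}[N_i(Y)]}{n}\Big) - R(Y) \right) \right|$$

$$\le \max_{C_n} \max\left( \frac{1}{m}\left| \sum_{i=1}^{m-1}\left( B_n^{(i),X}\Big(\frac{N_i(X)}{n}\Big) - B_n^{(i),X}\Big(\frac{\mathbb{E}[N_i(X)]}{n}\Big) \right) \right|,\ \frac{1}{m}\left| \sum_{i=1}^{m-1}\left( B_n^{(i),Y}\Big(\frac{N_i(Y)}{n}\Big) - B_n^{(i),Y}\Big(\frac{\mathbb{E}[N_i(Y)]}{n}\Big) \right) \right| \right). \tag{4.8}$$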


We now wish to show that the right-hand side of (4.8) converges to zero, in probability.
First note that for each $2 \le i \le m-1$, $C_{n,i}(k_1,\dots,k_{i-1}) \subset \{k = (k_1,\dots,k_{m-1}) : 0 \le k_i \le
\min(N_i(X), N_i(Y))\} \subset \{k = (k_1,\dots,k_{m-1}) : 0 \le k_i \le n\}$, and the same holds true for
$C_{n,1}$, see (2.11). But, $B_n^{(i)}(N_i/n) - B_n^{(i)}(\mathbb{E}[N_i]/n)$, where we have dropped $X$ and $Y$, does
not depend on $k$. Therefore, the maximum can be skipped and the problem reduces to
showing that, for all $1 \le i \le m-1$:

$$B_n^{(i)}\Big(\frac{N_i}{n}\Big) - B_n^{(i)}\Big(\frac{\mathbb{E}[N_i]}{n}\Big) \xrightarrow{\mathbb{P}} 0, \tag{4.9}$$

as $n \to +\infty$. This follows from the forthcoming lemma applied, for each $i = 1,\dots,m-1$,
to the random variables $Z_j^{(i)} = \big(N_m^{T_i^{j-1},T_i^j} - 1\big)/\sqrt{2}$, present in both (4.4) and (4.5) and
which, by Proposition 3.1, are iid with mean zero and variance one. Note that the lemma
below (see Appendix A.1 for a proof) can indeed be brought into play since Hoeffding's
inequality, applied to the random variables $N_i$, ensures that for $x_n = \sqrt{n}\ln n$,

$$\lim_{n \to +\infty} \mathbb{P}\big(|N_i - \mathbb{E}[N_i]| \ge x_n\big) \le \lim_{n \to +\infty} 2e^{-2x_n^2/n} = 0. \tag{4.10}$$
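To see how fast the Hoeffding bound in (4.10) decays with $x_n = \sqrt{n}\ln n$, here is a small numerical sketch. The function name `hoeffding_bound` is illustrative; with this choice of $x_n$ the bound simplifies to $2\exp(-2(\ln n)^2)$, which decays faster than any power of $n$.

```python
import math


def hoeffding_bound(n: int) -> float:
    """Right-hand side 2 * exp(-2 * x_n**2 / n) of (4.10) with x_n = sqrt(n) * ln(n)."""
    x_n = math.sqrt(n) * math.log(n)
    return 2.0 * math.exp(-2.0 * x_n ** 2 / n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, hoeffding_bound(n))
```

The super-polynomial decay is what later allows the bound to absorb polynomial factors such as $n^{i-1}$ or $n x_n$ coming from union bounds.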


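Lemma 4.2 Let $(Z_j)_{j \ge 1}$ be iid centered random variables with unit variance, and for each
$n \in \mathbb{N}$, let $N^{(n)}$ be an $\mathbb{N}$-valued random variable such that $\lim_{n \to +\infty} \mathbb{P}(|N^{(n)} - \mathbb{E}[N^{(n)}]| \ge
x_n) = 0$, where $x_n > 0$ is such that $\lim_{n \to +\infty} x_n/n = 0$. Then,

$$\frac{1}{\sqrt{n}} \sum_{j \in [N^{(n)},\, \mathbb{E}[N^{(n)}]]} Z_j \xrightarrow{\mathbb{P}} 0,$$

where $[N^{(n)}, \mathbb{E}[N^{(n)}]]$ is short for $[\min(N^{(n)}, \mathbb{E}[N^{(n)}]),\ \max(N^{(n)}, \mathbb{E}[N^{(n)}])]$.
At this stage, (4.9) is proved and therefore,

$$\frac{\mathrm{LCI}_n - n/m}{\sqrt{2n}} = \max_{C_n} \min\left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}\Big(\frac{\mathbb{E}[N_i(X)]}{n}\Big) - R(X),\ \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}\Big(\frac{\mathbb{E}[N_i(Y)]}{n}\Big) - R(Y) \right) + o_{\mathbb{P}}(1), \tag{4.11}$$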


finishing the first part of the proof of Theorem 1.1. Indeed, $(N_1,\dots,N_m)$ is multinomial
with parameters $n$ and $(p_1,\dots,p_m)$. So, for uniform draws, $\mathbb{E}[N_i(X)] = \mathbb{E}[N_i(Y)] = np_i =
n/m$. Then, by the multivariate Donsker theorem, see Th. 4.1, and scaling,

$$\frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}\Big(\frac{\mathbb{E}[N_i(X)]}{n}\Big) \Longrightarrow \frac{1}{m\sqrt{m}}\sum_{i=1}^{m-1} B^{(i),X}(1), \qquad n \to +\infty, \tag{4.12}$$

where $(B^{(1),X}(t),\dots,B^{(m-1),X}(t))_{0 \le t \le 1}$ is an $(m-1)$-dimensional Brownian motion and
similarly for $Y$. As shown next, the covariance matrix of this Brownian motion at time $t$
is $t\Sigma = t(\sigma_{k,l})_{1 \le k,l \le m-1}$, where

$$\Sigma = \begin{pmatrix} 1 & 1/2 & \cdots & 1/2 \\ 1/2 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 1/2 \\ 1/2 & \cdots & 1/2 & 1 \end{pmatrix}. \tag{4.13}$$


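Indeed, $\Sigma$ in (4.13) is obtained as follows: First, since

$$(B_n^{(1),X},\dots,B_n^{(m-1),X}) \Longrightarrow (B^{(1),X},\dots,B^{(m-1),X}),$$

uniform integrability (see Lemma 4.3, below) entails

$$\lim_{n \to +\infty} \mathrm{Cov}\big(B_n^{(k),X}(1),\ B_n^{(l),X}(1)\big) = \mathrm{Cov}\big(B^{(k),X}(1),\ B^{(l),X}(1)\big) = \sigma_{k,l}.$$

Next, in the uniform case, Proposition 3.2 writes, for $i = 1,\dots,m-1$, as

$$N_m = N_i + \sqrt{2n}\, B_n^{(i),X}\Big(\frac{N_i}{n}\Big) + o_{\mathbb{P}}(\sqrt{n}),$$

so that using also Remark 3.1

$$\mathrm{Cov}\left( B_n^{(k),X}\Big(\frac{N_k}{n}\Big),\ B_n^{(l),X}\Big(\frac{N_l}{n}\Big) \right) = \frac{1}{2n}\mathrm{Cov}(N_m - N_k,\ N_m - N_l) + o(1). \tag{4.14}$$

But $(N_1,\dots,N_m) \sim \mathrm{Mult}\big(n, (\tfrac{1}{m},\dots,\tfrac{1}{m})\big)$, $N_i/n \xrightarrow{\mathbb{P}} 1/m$, $\mathrm{Var}(N_i) = n(m-1)/m^2$ and,
when $i \ne j$, $\mathrm{Cov}(N_i, N_j) = -n/m^2$. Therefore, for $k \ne l$,

$$\frac{1}{2n}\mathrm{Cov}(N_m - N_k,\ N_m - N_l) = \frac{1}{2n}\left( \frac{n(m-1)}{m^2} + \frac{n}{m^2} + \frac{n}{m^2} - \frac{n}{m^2} \right) = \frac{1}{2m}, \tag{4.15}$$

while the same computation for $k = l$ gives $1/m$.
Since by Lemma 4.3, $\lim_{n \to +\infty} \mathbb{E}\big[\big(B_n^{(k)}(N_k/n) - B_n^{(k)}(1/m)\big)^2\big] = 0$, it follows

$$\lim_{n \to +\infty} \mathrm{Cov}\left( B_n^{(k),X}\Big(\frac{N_k}{n}\Big),\ B_n^{(l),X}\Big(\frac{N_l}{n}\Big) \right) = \lim_{n \to +\infty} \mathrm{Cov}\left( B_n^{(k),X}\Big(\frac{1}{m}\Big),\ B_n^{(l),X}\Big(\frac{1}{m}\Big) \right)$$

$$= \mathrm{Cov}\left( B^{(k),X}\Big(\frac{1}{m}\Big),\ B^{(l),X}\Big(\frac{1}{m}\Big) \right) = \frac{1}{m}\mathrm{Cov}\big( B^{(k),X}(1),\ B^{(l),X}(1) \big). \tag{4.16}$$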
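The arithmetic behind (4.15), and the resulting entries of $\Sigma$ in (4.13), can be checked exactly with rational numbers. The sketch below uses only the standard multinomial variance and covariance formulas quoted above; the function names `multinomial_cov`, `scaled_cov` and `sigma_entry` are illustrative and not from the paper.

```python
from fractions import Fraction


def multinomial_cov(n, m, i, j):
    """Cov(N_i, N_j) for a multinomial(n; 1/m, ..., 1/m) vector."""
    p = Fraction(1, m)
    return n * p * (1 - p) if i == j else -n * p * p


def scaled_cov(n, m, k, l):
    """(1/(2n)) * Cov(N_m - N_k, N_m - N_l), for indices k, l < m."""
    c = (multinomial_cov(n, m, m, m) - multinomial_cov(n, m, m, l)
         - multinomial_cov(n, m, k, m) + multinomial_cov(n, m, k, l))
    return c / (2 * n)


def sigma_entry(n, m, k, l):
    """Entry sigma_{k,l}: by (4.14) and (4.16) it equals m times the scaled covariance."""
    return m * scaled_cov(n, m, k, l)
```

For any $n$ and $m \ge 3$, `sigma_entry` returns $1$ on the diagonal and $1/2$ off the diagonal, in agreement with (4.13).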


Finally, (4.14), (4.15), (4.16) ensure the expression (4.13) for the covariance. To finish, let
us state a lemma, just used above, whose proof is presented in Appendix A.1.


Lemma 4.3 The sequences $\big(B_n^{(k)}(N_k/n)^2\big)_{n \ge 1}$ and $\big(B_n^{(k)}(1/m)^2\big)_{n \ge 1}$, $k = 1,\dots,m-1$, are
uniformly integrable and

$$\lim_{n \to +\infty} \mathbb{E}\left[ \left| B_n^{(k)}\Big(\frac{N_k}{n}\Big) - B_n^{(k)}\Big(\frac{1}{m}\Big) \right|^2 \right] = 0. \tag{4.17}$$


4.2 The Increments 


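In this section, we compare the maximum of two different quantities over the same set
of constraints in order to simplify the quantities to be maximized (before simplifying the
constraints $C_n$ themselves, in the next section). The quantities to compare are:

$$\max_{k \in C_n} \left[ \left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}(p_i(X)) - \sum_{i=1}^{m-1}\left( B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) \right) \right) \right.$$

$$\left. \wedge \left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}(p_i(Y)) - \sum_{i=1}^{m-1}\left( B_n^{(i),Y}\Big(\frac{N_i^*(Y)+k_i}{n}\Big) - B_n^{(i),Y}\Big(\frac{N_i^*(Y)}{n}\Big) \right) \right) \right] \tag{4.18}$$

and

$$\max_{k \in C_n} \left[ \left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),X}(p_i(X)) - \sum_{i=1}^{m-1}\left( B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right) \right) \right.$$

$$\left. \wedge \left( \frac{1}{m}\sum_{i=1}^{m-1} B_n^{(i),Y}(p_i(Y)) - \sum_{i=1}^{m-1}\left( B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right) \right) \right]. \tag{4.19}$$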


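Using (4.3) in Lemma 4.1, their absolute difference is upper-bounded by

$$\max_{k \in C_n} \left( \left| \sum_{i=1}^{m-1} \left[ \left( B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) \right) - \left( B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right) \right] \right| \right.$$

$$\left. \vee \left| \sum_{i=1}^{m-1} \left[ \left( B_n^{(i),Y}\Big(\frac{N_i^*(Y)+k_i}{n}\Big) - B_n^{(i),Y}\Big(\frac{N_i^*(Y)}{n}\Big) \right) - \left( B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right) \right] \right| \right)$$

$$\le \max_{k \in C_n} \sum_{i=1}^{m-1} \left( \left| B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) \right| \vee \left| B_n^{(i),Y}\Big(\frac{N_i^*(Y)+k_i}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) \right| \right)$$

$$+ \max_{k \in C_n} \sum_{i=1}^{m-1} \left( \left| B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right| \vee \left| B_n^{(i),Y}\Big(\frac{N_i^*(Y)}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right| \right).$$

Recall that $N_1^*(X) = N_1^*(Y) = 0$. Hence, for $i = 1$,

$$B_n^{(1),X}\Big(\frac{N_1^*(X)+k_1}{n}\Big) - B_n^{(1),X}\Big(\frac{k_1}{n}\Big) = B_n^{(1),X}\Big(\frac{N_1^*(X)}{n}\Big) - B_n^{(1),X}(0) = 0,$$

with the same property for functionals relative to $Y$. Therefore, we are left with investigating
terms of the form

$$\max_{k \in C_n} \left( \left| B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) \right| \vee \left| B_n^{(i),Y}\Big(\frac{N_i^*(Y)+k_i}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) \right| \right), \tag{4.20}$$

and

$$\max_{k \in C_n} \left( \left| B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right| \vee \left| B_n^{(i),Y}\Big(\frac{N_i^*(Y)}{n}\Big) - B_n^{(i),Y}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right| \right) \tag{4.21}$$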

for $2 \le i \le m-1$. Above, all the quantities considered only depend on a single sequence,
say $X$ or $Y$, except for the constraints in $C_n$ which depend on both $X$ and $Y$. However,

$$C_{n,i}(k_1,\dots,k_{i-1}) \subset C^*_{n,i}(X) := \{k = (k_1,\dots,k_{m-1}) : 0 \le k_i \le N_i(X) - N_i^*(X)\} \tag{4.22}$$

(resp. $C_{n,i}(k_1,\dots,k_{i-1}) \subset C^*_{n,i}(Y)$, and the same for $C_{n,1}$) and so upper-bounding, in (4.20)
and (4.21), the inner maxima by sums and the maxima over $C_n$ by maxima over

$$C^*_n(X) := \bigcap_{i=1}^{m-1} C^*_{n,i}(X), \tag{4.23}$$

(resp. $C^*_n(Y)$), we are left with investigating, for $2 \le i \le m-1$, the convergence in
probability of terms of the form

$$\max_{k \in C^*_n(X)} \left| B_n^{(i),X}\Big(\frac{N_i^*(X)+k_i}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i} k_j}{n}\Big) \right| \tag{4.24}$$

and

$$\max_{k \in C^*_n(X)} \left| B_n^{(i),X}\Big(\frac{N_i^*(X)}{n}\Big) - B_n^{(i),X}\Big(\frac{\sum_{j=1}^{i-1} k_j}{n}\Big) \right| \tag{4.25}$$

and, similarly with $X$ replaced by $Y$. Omitting the reference to either $X$ or $Y$, the terms
to control are, from (4.4) and for each $2 \le i \le m-1$, of the form:

$$\max_{k \in C^*_n} \left| \sum_{j=k_1+\cdots+k_i+1}^{N_i^*+k_i} \frac{Z_j^{(i)}}{\sqrt{n}} \right| \tag{4.26}$$

and

$$\max_{k \in C^*_n} \left| \sum_{j=k_1+\cdots+k_{i-1}+1}^{N_i^*} \frac{Z_j^{(i)}}{\sqrt{n}} \right|, \tag{4.27}$$

where the $Z_j^{(i)}$, $j \ge 1$, are defined in (4.5) and where

$$C^*_n = \bigcap_{i=1}^{m-1} C^*_{n,i}, \quad \text{with} \quad C^*_{n,i} = \{k = (k_1,\dots,k_{m-1}) : 0 \le k_i \le N_i - N_i^*\}.$$

In (4.26), (4.27) and henceforth, we write $\sum_{j=n_1}^{n_2}$ regardless of the order of $n_1$ and $n_2$, i.e.,
by convention this sum is $\sum_{j=n_2}^{n_1}$ when $n_2 < n_1$.

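Since (4.27) is similar, but easier to tackle than (4.26), we only deal with (4.26). Again,
as in Section 4.1, let $D_n^i = \{|N_i - \mathbb{E}[N_i]| \le \sqrt{n}\ln n\}$ for $i = 1,2,\dots,m-1$, and, thus, for
$\varepsilon > 0$,

$$\mathbb{P}\left( \max_{k \in C^*_n} \left| \sum_{j=k_1+\cdots+k_i+1}^{N_i^*+k_i} Z_j^{(i)} \right| \ge \varepsilon\sqrt{n} \right) \le \mathbb{P}\left( \left\{ \max_{k \in C^*_n} \left| \sum_{j=k_1+\cdots+k_i+1}^{N_i^*+k_i} Z_j^{(i)} \right| \ge \varepsilon\sqrt{n} \right\} \cap \bigcap_{i=1}^{m-1} D_n^i \right) + \sum_{i=1}^{m-1} \mathbb{P}\big((D_n^i)^c\big). \tag{4.28}$$

Let $C_n^{i-1} = \bigcap_{j=1}^{i-1} \{k_j \le \mathbb{E}[N_j] + \sqrt{n}\ln n\}$ and let $C_n^{*,i}$ be the set of indices $k_1,\dots,k_i$ given by

$$C_n^{*,i} = C_n^{i-1} \cap \big\{ \mathbb{E}[N_i^*] \le \ell_i = k_1 + \cdots + k_i \le \mathbb{E}[N_i] + \sqrt{n}\ln n - (N_i^* - \mathbb{E}[N_i^*]) \big\}, \tag{4.29}$$

where we set $\ell_i := k_1 + \cdots + k_i$. Since under $\bigcap_{i=1}^{m-1} D_n^i$, $C^*_{n,i} \subset \{k_i \le \mathbb{E}[N_i] + \sqrt{n}\ln n\}$ and
since Proposition 3.4, specialized to the uniform case, gives $\mathbb{E}[N_i^*] = \sum_{j=1}^{i-1} k_j$, it follows
that $C^*_n \subset C_n^{*,i}$ and (4.28) is thus further upper-bounded by

$$\mathbb{P}\left( \max_{k \in C_n^{*,i}} \left| \sum_{j=\ell_i+1}^{\ell_i + N_i^* - (k_1+\cdots+k_{i-1})} Z_j^{(i)} \right| \ge \varepsilon\sqrt{n} \right) + \sum_{i=1}^{m-1} \mathbb{P}\big((D_n^i)^c\big). \tag{4.30}$$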

Now, in view of (4.10), it is enough to show the convergence to zero of the first term on
the right-hand side of (4.30). To do so, set $E_n^1 = \Omega$ and, for $2 \le i \le m-1$,

$$E_n^i(k_1,\dots,k_{i-1}) = \big\{ \big| N_i^*(k_1,\dots,k_{i-1}) - \mathbb{E}[N_i^*(k_1,\dots,k_{i-1})] \big| \le x_n \big\},$$

with

$$x_n = \sqrt{n}\ln n, \tag{4.31}$$

and let

$$E_n^i = \bigcap_{(k_1,\dots,k_{i-1}) \in C_n^{i-1}} E_n^i(k_1,\dots,k_{i-1}). \tag{4.32}$$

Our next goal is to show that asymptotically, $E_n^i$ has full probability.

Proposition 4.1 Let $2 \le i \le m-1$, then $\lim_{n \to +\infty} \mathbb{P}\big((E_n^i)^c\big) = 0$.

In order to prove Proposition 4.1, we first need the following technical result, proved in
Appendix A.1:

Lemma 4.4 For $x \in [-n, +\infty)$, let

$$K_n(x) = \frac{(x+2n)^{x+2n}}{(2x+2n)^{x+n}\,(2n)^n}.$$

Then, for some constants $c, C \in (0, +\infty)$,

$$K_n(x) \le C\exp\left( -cn \min\left( \frac{|x|}{n},\ \frac{x^2}{n^2} \right) \right). \tag{4.33}$$
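The shape of the bound (4.33) can be explored numerically by working with $\ln K_n(x)$, which avoids overflow. In the sketch below the constants $c = 0.15$ and $C = 1$ are illustrative choices made for this check only (the paper asserts only that suitable constants exist), and the names `log_K`, `exponent_bound` and `bound_holds` are hypothetical.

```python
import math


def log_K(n, x):
    """Natural logarithm of K_n(x), for x > -n."""
    return ((x + 2 * n) * math.log(x + 2 * n)
            - (x + n) * math.log(2 * x + 2 * n)
            - n * math.log(2 * n))


def exponent_bound(n, x, c):
    """Exponent -c * n * min(|x|/n, x^2/n^2) of the right-hand side of (4.33)."""
    return -c * n * min(abs(x) / n, (x / n) ** 2)


def bound_holds(n, c=0.15, x_max_factor=20):
    """Check ln K_n(x) <= -c n min(|x|/n, x^2/n^2) on the integers -n < x <= x_max_factor * n."""
    for x in range(-n + 1, x_max_factor * n + 1):
        if log_K(n, x) > exponent_bound(n, x, c) + 1e-9:
            return False
    return True
```

The two regimes of the minimum are visible in the output of `log_K`: quadratic decay in $x/n$ near the origin (Gaussian regime) and linear decay for large $|x|/n$ (exponential regime).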


We proceed now to the proof of Proposition 4.1:
Proof. (Prop. 4.1) Clearly,

$$\mathbb{P}\big((E_n^i)^c\big) \le \sum_{(k_1,\dots,k_{i-1}) \in C_n^{i-1}} \mathbb{P}\big((E_n^i(k_1,\dots,k_{i-1}))^c\big) \le n^{i-1} \max_{(k_1,\dots,k_{i-1}) \in C_n^{i-1}} \mathbb{P}\big((E_n^i(k_1,\dots,k_{i-1}))^c\big).$$

Therefore, to prove the proposition, it is enough to show that:

$$\lim_{n \to +\infty} n^{i-1} \max_{(k_1,\dots,k_{i-1}) \in C_n^{i-1}} \mathbb{P}\big((E_n^i(k_1,\dots,k_{i-1}))^c\big) = 0. \tag{4.34}$$

Now, for each $2 \le i \le m-1$, Propositions 3.3 and 3.4 assert that,

$$N_i^* = N_i^*(k_1,\dots,k_{i-1}) = \sum_{j=1}^{i-1} N^*_{i,j},$$

where the $(N^*_{i,j})_{1 \le j \le i-1}$ are independent and with probability generating function

$$\mathbb{E}\big[x^{N^*_{i,j}}\big] = \left( \frac{1}{2-x} \right)^{k_j}.$$

Next,

$$\mathbb{P}\big((E_n^i(k_1,\dots,k_{i-1}))^c\big) = \mathbb{P}\big(|N_i^* - \mathbb{E}[N_i^*]| > x_n\big) \le \mathbb{P}\left( \sum_{j=1}^{i-1} (N^*_{i,j} - k_j) > x_n \right) + \mathbb{P}\left( \sum_{j=1}^{i-1} (k_j - N^*_{i,j}) > x_n \right). \tag{4.35}$$

The first term in (4.35) is bounded by $\Theta^r_{k_1+\cdots+k_{i-1}}(x_n)$, where

$$\Theta^r_k(x) := \min_{t > 0} \exp\Big( -\big( t(x+k) + k\ln(2 - e^{t}) \big) \Big) \tag{4.36}$$

$$= \frac{(x+2k)^{x+2k}}{(2x+2k)^{x+k}\,(2k)^k}, \tag{4.37}$$

since the minimization in (4.36) occurs at $t = \ln\big((2x+2k)/(x+2k)\big)$.
The second term in (4.35) is bounded by $\Theta^l_{k_1+\cdots+k_{i-1}}(x_n)$, where

$$\Theta^l_k(x) := \min_{t > 0} \exp\Big( -\big( t(x-k) + k\ln(2 - e^{-t}) \big) \Big) \tag{4.38}$$

$$= \frac{(2k-x)^{2k-x}}{(2k-2x)^{k-x}\,(2k)^k}, \tag{4.39}$$

observing that, for $x < k$, the minimization in (4.38) occurs at $t = \ln\big((2k-x)/(2k-2x)\big)$.
From the previous bounds and (4.35), it is clear that (4.34) will follow from

$$\lim_{n \to +\infty} n^{i-1} \max_{(k_1,\dots,k_{i-1}) \in C_n^{i-1}} \Theta^{\bullet}_{k_1+\cdots+k_{i-1}}(x_n) = 0, \tag{4.40}$$

for $\bullet \in \{l, r\}$. To obtain such a limit, we make use of Lemma 4.4, with $x = x_n = \sqrt{n}\ln n$,
noting also that $C_n^{i-1} \subset \{(k_1,\dots,k_{i-1}) : k_1 + \cdots + k_{i-1} \le \sum_{j=1}^{i-1}\mathbb{E}[N_j] + (i-1)\sqrt{n}\ln n\} \subset
\{(k_1,\dots,k_{i-1}) : k_1 + \cdots + k_{i-1} \le (i-1)(\max_{j=1,\dots,i-1} p_j\, n + \sqrt{n}\ln n)\}$.

First, for $\bullet = r$, when $k_1 + \cdots + k_{i-1} \le x_n$, (4.33) writes as

$$\Theta^r_{k_1+\cdots+k_{i-1}}(x_n) \le C\exp\left( -c(k_1+\cdots+k_{i-1}) \min\left( \frac{x_n}{k_1+\cdots+k_{i-1}},\ \frac{x_n^2}{(k_1+\cdots+k_{i-1})^2} \right) \right) = C\exp(-cx_n),$$

so that

$$n^{i-1} \max_{k_1+\cdots+k_{i-1} \le x_n} \Theta^r_{k_1+\cdots+k_{i-1}}(x_n) \le Cn^{i-1}e^{-c\sqrt{n}\ln n} \to 0, \qquad n \to +\infty,$$

where above, and below, $C$ is a finite positive constant whose value might change from
a line to another. For $x_n \le k_1 + \cdots + k_{i-1} \le (i-1)(n\max_{j=1,\dots,i-1} p_j + \sqrt{n}\ln n) =
(i-1)(n/m + \sqrt{n}\ln n)$, (4.33) writes as

$$\Theta^r_{k_1+\cdots+k_{i-1}}(x_n) \le C\exp\left( -c(k_1+\cdots+k_{i-1}) \min\left( \frac{x_n}{k_1+\cdots+k_{i-1}},\ \frac{x_n^2}{(k_1+\cdots+k_{i-1})^2} \right) \right)$$

$$= C\exp\left( -c\,\frac{x_n^2}{k_1+\cdots+k_{i-1}} \right) \le C\exp\left( -c\,\frac{x_n^2}{(i-1)(n/m + \sqrt{n}\ln n)} \right),$$

so that

$$n^{i-1} \max_{x_n \le k_1+\cdots+k_{i-1} \le (i-1)(n/m+\sqrt{n}\ln n)} \Theta^r_{k_1+\cdots+k_{i-1}}(x_n) \le Cn^{i-1}\exp\left( -c\,\frac{n(\ln n)^2}{(i-1)(n/m + \sqrt{n}\ln n)} \right) \to 0, \qquad n \to +\infty,$$

guaranteeing (4.40) with $\bullet = r$.

Next, let $\bullet = l$ and consider the following three cases: $k_1 + \cdots + k_{i-1} \le x_n/2$, $x_n/2 \le
k_1 + \cdots + k_{i-1} \le x_n$ and $x_n \le k_1 + \cdots + k_{i-1} \le (i-1)(n\max_{j=1,\dots,i-1} p_j + \sqrt{n}\ln n) =
(i-1)(n/m + \sqrt{n}\ln n)$. When $k_1 + \cdots + k_{i-1} \le x_n/2$, (4.38) ensures that for all $t > 0$:

$$\Theta^l_{k_1+\cdots+k_{i-1}}(x_n) \le \exp\Big( t(k_1+\cdots+k_{i-1} - x_n) - (k_1+\cdots+k_{i-1})\ln(2 - e^{-t}) \Big) \le \exp\left( -\frac{t}{2}\,x_n \right). \tag{4.41}$$

When $x_n/2 \le k_1 + \cdots + k_{i-1} \le x_n$, (4.38) ensures that for all $t > 0$:

$$\Theta^l_{k_1+\cdots+k_{i-1}}(x_n) \le \exp\Big( t(k_1+\cdots+k_{i-1} - x_n) - (k_1+\cdots+k_{i-1})\ln(2 - e^{-t}) \Big) \le \exp\left( -\frac{x_n}{2}\ln(2 - e^{-t}) \right). \tag{4.42}$$

When $x_n \le k_1 + \cdots + k_{i-1} \le (i-1)(n/m + \sqrt{n}\ln n)$, (4.39) and (4.33) in Lemma 4.4 ensure
that:

$$\Theta^l_{k_1+\cdots+k_{i-1}}(x_n) \le C\exp\left( -c(k_1+\cdots+k_{i-1}) \min\left( \frac{x_n}{k_1+\cdots+k_{i-1}},\ \frac{x_n^2}{(k_1+\cdots+k_{i-1})^2} \right) \right)$$

$$= C\exp\left( -c\,\frac{x_n^2}{k_1+\cdots+k_{i-1}} \right) \le C\exp\left( -c\,\frac{n(\ln n)^2}{(i-1)(n/m + \sqrt{n}\ln n)} \right). \tag{4.43}$$

Gathering together the bounds (4.41), (4.42) and (4.43) proves (4.40), for $\bullet = l$. Combining
this last fact with the corresponding result for $\bullet = r$, and via (4.40) and (4.34), proves
Proposition 4.1. □
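The closed forms (4.37) and (4.39) used in the proof come from optimizing Chernoff-type bounds in $t$; this can be confirmed numerically on a small instance. The sketch below is only a check under illustrative parameter values, with hypothetical function names (`chernoff_right`, `theta_right`, `chernoff_left`, `theta_left`).

```python
import math


def chernoff_right(k, x, t):
    """exp(-(t(x + k) + k ln(2 - e^t))), the quantity minimized in (4.36); needs 0 < t < ln 2."""
    return math.exp(-(t * (x + k) + k * math.log(2.0 - math.exp(t))))


def theta_right(k, x):
    """Closed form (4.37)."""
    return math.exp((x + 2 * k) * math.log(x + 2 * k)
                    - (x + k) * math.log(2 * x + 2 * k)
                    - k * math.log(2 * k))


def chernoff_left(k, x, t):
    """exp(-(t(x - k) + k ln(2 - e^{-t}))), the quantity minimized in (4.38); needs t > 0."""
    return math.exp(-(t * (x - k) + k * math.log(2.0 - math.exp(-t))))


def theta_left(k, x):
    """Closed form (4.39), valid for 0 < x < k."""
    return math.exp((2 * k - x) * math.log(2 * k - x)
                    - (k - x) * math.log(2 * k - 2 * x)
                    - k * math.log(2 * k))
```

For example, with `k, x = 5, 3`, evaluating `chernoff_right` at `t = math.log((2*x + 2*k) / (x + 2*k))` reproduces `theta_right(5, 3)`, and no other value of `t` on a fine grid gives a smaller bound; the same holds for the left tail at `t = math.log((2*k - x) / (2*k - 2*x))`.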


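Now, thanks to Proposition 4.1, to prove the convergence to zero, as $n \to +\infty$, of the first
term on the right-hand side of (4.30), it is enough to prove the same result for

$$\mathbb{P}\left( \left\{ \max_{k \in C_n^{*,i}} \left| \sum_{j=\ell_i+1}^{\ell_i + N_i^* - (k_1+\cdots+k_{i-1})} Z_j^{(i)} \right| \ge \varepsilon\sqrt{n} \right\} \cap E_n^i \right), \tag{4.44}$$

where the $Z_j^{(i)}$ are given in (4.5), i.e., $Z_j^{(i)} = \big(N_m^{T_i^{j-1},T_i^j} - 1\big)/\sqrt{2}$, $i = 1,\dots,m-1$, $j \ge 1$.
Our next elementary proposition, the ultimate before closing this section, provides tail
estimates on the partial sums of the $Z_j$ (omitting the indices $i$ for a while).


Proposition 4.2 Let $(Z_j)_{j \ge 1}$ be iid random variables as in (4.5). Then, for suitable positive
and finite constants $c$ and $C$, all $x > 0$, and all positive integers $k$,

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \ge x \right) \le \min_{t > 0} \exp\Big( -\big( t(x\sqrt{2} + k) + k\ln(2 - e^{t}) \big) \Big) =: \Theta^r_k(x\sqrt{2}), \tag{4.45}$$

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \ge x \right) \le C\exp\left( -ck\min\left( \frac{x}{k},\ \frac{x^2}{k^2} \right) \right), \tag{4.46}$$

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \le -x \right) \le \min_{t > 0} \exp\Big( -\big( t(x\sqrt{2} - k) + k\ln(2 - e^{-t}) \big) \Big) =: \Theta^l_k(x\sqrt{2}), \tag{4.47}$$

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \le -x \right) \le C\exp\left( -ck\min\left( \frac{x}{k},\ \frac{x^2}{k^2} \right) \right), \quad \text{for } x < k. \tag{4.48}$$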

Proof. Recall from (4.5) that $Z_j = \big(N_m^{T_i^{j-1},T_i^j} - 1\big)/\sqrt{2}$, $i \ne m$, and from (3.2),

$$\mathbb{E}\Big[ x^{N_m^{T_i^{j-1},T_i^j}} \Big] = \frac{1}{2-x}. \tag{4.49}$$

Hence, using the notation in (4.36),

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \ge x \right) \le \min_{t > 0}\left( e^{-t(x\sqrt{2}+k)}\, \mathbb{E}\Big[ \exp\big( tN_m^{T_i^{0},T_i^{1}} \big) \Big]^k \right) = \Theta^r_k(x\sqrt{2}),$$

and (4.46) follows from (4.36) and (4.37) in (the proof of) Proposition 4.1 (with its notation)
and from (4.33) in Lemma 4.4. Similarly, using the notation in (4.38),

$$\mathbb{P}\left( \sum_{j=1}^{k} Z_j \le -x \right) \le \Theta^l_k(x\sqrt{2}),$$

which is (4.47). As previously observed via (4.39), when $x < k$, the minimization for $\Theta^l_k(x)$
occurs at $t = \ln\big((2k-x)/(2k-2x)\big)$, and, once again, (4.33) in Lemma 4.4 ensures (4.48).
□
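The normalization in (4.5) can be cross-checked against the generating function (4.49): expanding $1/(2-x) = \sum_{j \ge 0} x^j/2^{j+1}$ shows that the underlying count takes the value $j$ with probability $2^{-(j+1)}$. The sketch below, with illustrative names (`gap_pmf`, `gap_moments`, `gap_pgf`) and a truncation of the series at 200 terms, confirms that this law has mean one and variance two, so that subtracting $1$ and dividing by $\sqrt{2}$ indeed centers and standardizes it.

```python
def gap_pmf(j):
    """P(N = j) = 2^{-(j+1)}, read off from the expansion of 1 / (2 - x)."""
    return 0.5 ** (j + 1)


def gap_moments(terms=200):
    """Mean and variance of N, computed from the truncated series."""
    mean = sum(j * gap_pmf(j) for j in range(terms))
    second = sum(j * j * gap_pmf(j) for j in range(terms))
    return mean, second - mean ** 2


def gap_pgf(x, terms=200):
    """Truncated probability generating function E[x^N], for |x| < 2."""
    return sum((x ** j) * gap_pmf(j) for j in range(terms))
```

The same expansion explains the constraint $e^{t} < 2$ in the Chernoff bound (4.36): the exponential moment $\mathbb{E}[e^{tN}] = 1/(2 - e^{t})$ is finite only for $t < \ln 2$.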


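We are now ready to move towards completing this section. From its very definition
in (4.29),

$$C_n^{*,i} = C_n^{i-1} \cap \big\{ \mathbb{E}[N_i^*] \le \ell_i = k_1+\cdots+k_i \le \mathbb{E}[N_i] + \sqrt{n}\ln n - (N_i^* - \mathbb{E}[N_i^*]) \big\}$$

$$\subset \Big\{ k_1+\cdots+k_{i-1} \le \sum_{j=1}^{i-1}\mathbb{E}[N_j] + (i-1)\sqrt{n}\ln n \Big\} \cap \big\{ \mathbb{E}[N_i^*] \le \ell_i = k_1+\cdots+k_i \le \mathbb{E}[N_i] + \sqrt{n}\ln n - (N_i^* - \mathbb{E}[N_i^*]) \big\}$$

$$\subset \Big\{ k_1+\cdots+k_{i-1} \le (i-1)\big( n\max_{j=1,\dots,i-1} p_j + \sqrt{n}\ln n \big) \Big\} \cap \big\{ \mathbb{E}[N_i^*] \le \ell_i = k_1+\cdots+k_i \le \mathbb{E}[N_i] + \sqrt{n}\ln n - (N_i^* - \mathbb{E}[N_i^*]) \big\}.$$

Therefore, recalling also from (3.25) that $\mathbb{E}[N_i^*] = k_1 + \cdots + k_{i-1}$, (4.44) is upper bounded
by:

$$\mathbb{P}\left( \left\{ \max_{\substack{k_1+\cdots+k_{i-1} \le (i-1)(n/m+\sqrt{n}\ln n) \\ k_1+\cdots+k_{i-1} \le \ell_i \le \mathbb{E}[N_i]+\sqrt{n}\ln n-(N_i^*-(k_1+\cdots+k_{i-1}))}} \left| \sum_{j=\ell_i+1}^{\ell_i+N_i^*-(k_1+\cdots+k_{i-1})} Z_j \right| \ge \varepsilon\sqrt{n} \right\} \cap E_n^i \right)$$

$$\le \mathbb{P}\left( \max_{\substack{k_1+\cdots+k_{i-1} \le (i-1)(n/m+\sqrt{n}\ln n) \\ k_1+\cdots+k_{i-1} \le \ell_i \le \mathbb{E}[N_i]+\sqrt{n}\ln n+x_n}}\ \max_{|n_i| \le x_n} \left| \sum_{j=\ell_i+1}^{\ell_i+n_i} Z_j \right| \ge \varepsilon\sqrt{n} \right) \qquad \text{(recall (4.32))}$$

$$\le \mathbb{P}\left( \max_{\ell_i \le \mathbb{E}[N_i]+\sqrt{n}\ln n+x_n}\ \max_{|n_i| \le x_n} \left| \sum_{j=\ell_i+1}^{\ell_i+n_i} Z_j \right| \ge \varepsilon\sqrt{n} \right)$$

$$\le 3nx_n \max_{\substack{\ell_i \le \mathbb{E}[N_i]+\sqrt{n}\ln n+x_n \\ |n_i| \le x_n}} \mathbb{P}\left( \left| \sum_{j=\ell_i+1}^{\ell_i+n_i} Z_j \right| \ge \varepsilon\sqrt{n} \right)$$

$$\le 3nx_n \max_{0 \le n_i \le x_n} \Big( \Theta^r_{n_i}(\varepsilon\sqrt{2n}) + \Theta^l_{n_i}(\varepsilon\sqrt{2n}) \Big), \tag{4.50}$$

where, in the next to last inequality, we used the usual (sharp in the iid case) bounding
of the maximum via the number of terms times the maximal probability; while in the last
one, $|n_i| \le x_n$ was changed into $0 \le n_i \le x_n = \sqrt{n}\ln n$.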

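Our final task is to show that

$$\lim_{n \to +\infty} nx_n \max_{0 \le k \le x_n} \Theta^{\bullet}_k(\varepsilon\sqrt{2n}) = 0, \tag{4.51}$$

for $\bullet \in \{l, r\}$. This relies again on Lemma 4.4 and Proposition 4.2. For $\bullet = r$, when
$k \le \varepsilon\sqrt{2n}$, (4.45) and (4.46) entail that,

$$\Theta^r_k(\varepsilon\sqrt{2n}) \le C\exp(-c\varepsilon\sqrt{2n}), \tag{4.52}$$

while, for $\varepsilon\sqrt{2n} \le k \le x_n$, they entail that,

$$\Theta^r_k(\varepsilon\sqrt{2n}) \le C\exp(-2c\varepsilon^2 n/k) \le C\exp(-2c\varepsilon^2 n/x_n) = C\exp(-2c\varepsilon^2\sqrt{n}/\ln n). \tag{4.53}$$

Therefore, for $\bullet = r$, (4.51) follows from (4.52) and (4.53). Let us now turn our attention
to $\bullet = l$. When $\varepsilon\sqrt{2n} \le k \le x_n$, (4.48) entails that,

$$\Theta^l_k(\varepsilon\sqrt{2n}) \le C\exp(-2c\varepsilon^2 n/k) \le C\exp(-2c\varepsilon^2 n/x_n) = C\exp(-2c\varepsilon^2\sqrt{n}/\ln n). \tag{4.54}$$

For $k \le \varepsilon\sqrt{n/2}$, (4.47) entails that, for any $t > 0$,

$$\Theta^l_k(\varepsilon\sqrt{2n}) \le \exp\big( t(k - \varepsilon\sqrt{2n}) - k\ln(2 - e^{-t}) \big) \le \exp\big( -\varepsilon t\sqrt{n/2} \big). \tag{4.55}$$

For $\varepsilon\sqrt{n/2} \le k \le \varepsilon\sqrt{2n}$, (4.47) entails that, for any $t > 0$,

$$\Theta^l_k(\varepsilon\sqrt{2n}) \le \exp\big( t(k - \varepsilon\sqrt{2n}) - k\ln(2 - e^{-t}) \big) \le \exp\big( -\varepsilon\sqrt{n/2}\,\ln(2 - e^{-t}) \big). \tag{4.56}$$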

Therefore, for $\bullet = l$, (4.51) follows from (4.54), (4.55), and (4.56). Gathering all the
intermediate results, for any $i = 2,\dots,m-1$,

$$\lim_{n \to +\infty} \mathbb{P}\left( \max_{k \in C_n^{*,i}} \left| \sum_{j=\ell_i+1}^{\ell_i+N_i^*-(k_1+\cdots+k_{i-1})} Z_j^{(i)} \right| \ge \varepsilon\sqrt{n} \right) = 0,$$

and therefore,

$$\lim_{n \to +\infty} \mathbb{P}\left( \max_{k \in C^*_n} \left| \sum_{j=k_1+\cdots+k_i+1}^{N_i^*+k_i} \frac{Z_j^{(i)}}{\sqrt{n}} \right| \ge \varepsilon \right) = 0.$$

The goal of this section has thus been achieved: the quantities (4.18) and (4.19) have the
same weak limit.


4.3 The Constraints


To deal with the last remaining heuristic limit (the First Limit above, on the constraints),
we now need to obtain the convergence of the random
set of constraints towards a deterministic set of constraints. This fact will follow from the
various reductions obtained to date as well as new arguments developed from now on. To
start with, let us recall two elementary facts about convergence in distribution.

The first fact asserts that if $(f_n)_{1 \le n \le \infty}$ is a sequence of Borel functions such that $x_n \to x_\infty$
implies that $f_n(x_n) \to f_\infty(x_\infty)$, and if $(X_n)_{n \ge 1}$ is a sequence of random variables such that
$X_n \Longrightarrow X_\infty$, then $f_n(X_n) \Longrightarrow f_\infty(X_\infty)$. Indeed, via the Skorohod representation theorem for
$C_0([0,1])$-valued random variables, there exists a probability space and $C_0([0,1])$-valued
random variables $Y_n$, $1 \le n \le \infty$, such that $Y_n \overset{\mathcal{L}}{=} X_n$, $1 \le n \le \infty$, and $Y_n \to Y_\infty$ with
probability one. But, by hypothesis, $f_n(Y_n) \to f_\infty(Y_\infty)$, with probability one. Therefore
$f_n(X_n) \Longrightarrow f_\infty(X_\infty)$.

The second elementary fact is as follows: Let $(X_n)_{n \ge 1}$ be a sequence of random variables
such that $X_n^{\pm} \Longrightarrow Y$, then $X_n \Longrightarrow Y$, where $x^+ = \max(x, 0)$ and $x^- = \min(x, 0)$. Indeed,
$\mathbb{P}(X_n^+ \le x) \le \mathbb{P}(X_n \le x) \le \mathbb{P}(X_n^- \le x)$, for all $x \in \mathbb{R}$.

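Using these two elementary facts, let us return to our derandomization problem. Recalling
(4.19), and using the polygonal structure of the processes $B_n^X$ and $B_n^Y$, we have

$$M_n = \max_{k \in C_n} \left( F_X\Big(B_n^X, \frac{k}{n}\Big) \wedge F_Y\Big(B_n^Y, \frac{k}{n}\Big) \right),$$

where

$$F_X(u, t) = \frac{1}{m}\sum_{i=1}^{m-1} u_i(p_i(X)) - \sum_{i=1}^{m-1}\left( u_i\Big(\sum_{j=1}^{i} t_j\Big) - u_i\Big(\sum_{j=1}^{i-1} t_j\Big) \right), \tag{4.57}$$

$$F_Y(u, t) = \frac{1}{m}\sum_{i=1}^{m-1} u_i(p_i(Y)) - \sum_{i=1}^{m-1}\left( u_i\Big(\sum_{j=1}^{i} t_j\Big) - u_i\Big(\sum_{j=1}^{i-1} t_j\Big) \right), \tag{4.58}$$

for $u = (u_1,\dots,u_{m-1}) \in (C_0([0,1]))^{m-1}$ and $t = (t_1,\dots,t_{m-1}) \in [0,1]^{m-1}$. Now, let

$$C_n^{\pm} = \left\{ k = (k_i)_{1 \le i \le m-1} : \forall i = 1,\dots,m-1,\ 0 \le k_i \le n \ \text{and}\ \sum_{j=1}^{i} \frac{k_j}{n} \le p_i \pm \frac{2x_n}{n} \right\}, \tag{4.59}$$

with $x_n = \sqrt{n}\ln n$ as in (4.31), and let

$$M_n^{\pm} = \max_{k \in C_n^{\pm}} \left( F_X\Big(B_n^X, \frac{k}{n}\Big) \wedge F_Y\Big(B_n^Y, \frac{k}{n}\Big) \right). \tag{4.60}$$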

Since

$$N_i(X) - N_i^*(X) = np_i - \sum_{j=1}^{i-1} k_j + \Big( (N_i(X) - \mathbb{E}[N_i(X)]) - (N_i^*(X) - \mathbb{E}[N_i^*(X)]) \Big),$$

with a similar statement replacing $X$ by $Y$, the condition

$$k_i \le (N_i(X) - N_i^*(X)) \wedge (N_i(Y) - N_i^*(Y)),$$

in the definition (2.11)-(2.12) of $C_n$, writes as $\sum_{j=1}^{i} k_j/n \le p_i + R_n^i(X,Y)/n$ where

$$R_n^i(X,Y) = \Big( (N_i(X) - \mathbb{E}[N_i(X)]) - (N_i^*(X) - \mathbb{E}[N_i^*(X)]) \Big) \wedge \Big( (N_i(Y) - \mathbb{E}[N_i(Y)]) - (N_i^*(Y) - \mathbb{E}[N_i^*(Y)]) \Big). \tag{4.61}$$

Now let

$$F_n := \bigcap_{i=1}^{m-1} \Big( \{|N_i - \mathbb{E}[N_i]| \le x_n\} \cap E_n^i \Big),$$

with $E_n^i$ defined in (4.32). From (4.10) and Proposition 4.1, we have $\lim_{n \to +\infty} \mathbb{P}(F_n^c) = 0$
and, on $F_n$, $|R_n^i(X,Y)| \le 2x_n$, for all $1 \le i \le m-1$. Therefore, when $F_n$ is realized, $C_n$ in
(2.11) is encapsulated as follows: $C_n^- \subset C_n \subset C_n^+$, and

$$M_n^- \le M_n \le M_n^+. \tag{4.62}$$

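Clearly,

$$M_n^{\pm} = \max_{t \in C_n^{\pm}} \big( F_X(B_n^X, t) \wedge F_Y(B_n^Y, t) \big), \tag{4.63}$$

where now

$$C_n^{\pm} = \left\{ t = (t_i)_{1 \le i \le m-1} \in [0,1]^{m-1} : \forall i = 1,\dots,m-1,\ \sum_{j=1}^{i} t_j \le p_i \pm \frac{2x_n}{n} \right\}. \tag{4.64}$$

Next,

$$\mathbb{P}(M_n \le x) \le \mathbb{P}(\{M_n \le x\} \cap F_n) + \mathbb{P}(F_n^c) \le \mathbb{P}(\{M_n^- \le x\} \cap F_n) + \mathbb{P}(F_n^c) \le \mathbb{P}(M_n^- \le x) + \mathbb{P}(F_n^c),$$

therefore

$$\limsup_{n \to +\infty} \mathbb{P}(M_n \le x) \le \limsup_{n \to +\infty} \mathbb{P}(M_n^- \le x). \tag{4.65}$$

Similarly,

$$\mathbb{P}(M_n \le x) = \mathbb{P}(\{M_n \le x\} \cap F_n) + \mathbb{P}(\{M_n \le x\} \cap F_n^c) \ge \mathbb{P}(\{M_n^+ \le x\} \cap F_n) \ge \mathbb{P}(M_n^+ \le x) - \mathbb{P}(F_n^c),$$

and therefore

$$\liminf_{n \to +\infty} \mathbb{P}(M_n \le x) \ge \liminf_{n \to +\infty} \mathbb{P}(M_n^+ \le x). \tag{4.66}$$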

Combining (4.65) and (4.66) with the second elementary fact described above, our goal is
now to show that the convergence in distribution of both $M_n^+$ and $M_n^-$ towards

$$M_\infty = \max_{t \in V} \big( F_X(B^X, t) \wedge F_Y(B^Y, t) \big), \tag{4.67}$$

holds true, where

$$V := V(p_1,\dots,p_{m-1}) = \left\{ t = (t_j)_{1 \le j \le m-1} \in [0,1]^{m-1} : \forall i = 1,\dots,m-1,\ \sum_{j=1}^{i} t_j \le p_i \right\}.$$

To do so, first note that by Donsker's theorem $(B_n^X, B_n^Y) \Longrightarrow (B^X, B^Y)$ and we now wish to
apply the first elementary fact, recalled above, to the functions

$$f_n^{\pm}(u, v) = \max_{t \in C_n^{\pm}} \big( F_X(u, t) \wedge F_Y(v, t) \big), \tag{4.68}$$

and

$$f_\infty(u, v) = \max_{t \in V} \big( F_X(u, t) \wedge F_Y(v, t) \big). \tag{4.69}$$

With these notations, $M_n^{\pm} = f_n^{\pm}(B_n^X, B_n^Y)$ and $M_\infty = f_\infty(B^X, B^Y)$. In other words, we
wish to show that $(u_n, v_n) \to (u, v)$ in $(C_0([0,1]))^{m-1}$ implies that $f_n^{\pm}(u_n, v_n) \to f_\infty(u, v)$.
To start with,

$$|f_n^{\pm}(u_n, v_n) - f_\infty(u, v)| \le |f_n^{\pm}(u_n, v_n) - f_n^{\pm}(u, v)| + |f_n^{\pm}(u, v) - f_\infty(u, v)|, \tag{4.70}$$

and we continue by estimating $|f_n^{\pm}(u_n, v_n) - f_n^{\pm}(u, v)|$. But,

$$|f_n^{\pm}(u_n, v_n) - f_n^{\pm}(u, v)| \le \max_{t \in C_n^{\pm}} \Big| \big( F_X(u_n, t) \wedge F_Y(v_n, t) \big) - \big( F_X(u, t) \wedge F_Y(v, t) \big) \Big|$$

$$\le \max_{t \in C_n^{\pm}} \max\big( |F_X(u_n, t) - F_X(u, t)|,\ |F_Y(v_n, t) - F_Y(v, t)| \big) \tag{4.71}$$

$$\le c \max_{s \in [0,1]} \max\big( |u_n(s) - u(s)|,\ |v_n(s) - v(s)| \big), \tag{4.72}$$

making use of Lemma 4.1 in (4.71), and of the linearity of both $F_X$ and $F_Y$, with respect to
their first argument, in (4.72), and where, further, $c$ is a finite positive constant (depending
explicitly on $m$). Therefore,

$$|f_n^{\pm}(u_n, v_n) - f_n^{\pm}(u, v)| \le c \max\big( \|u_n - u\|_\infty,\ \|v_n - v\|_\infty \big),$$

and so if $(u_n, v_n) \to (u, v)$, it follows that $f_n^{\pm}(u_n, v_n) - f_n^{\pm}(u, v) \to 0$.

In order to complete the proof of $M_n^{\pm} \Longrightarrow M_\infty$ and thus that of $M_n \Longrightarrow M_\infty$, let us now
estimate the right-most expression in (4.70).

At first, note that $C_n^- \subset V \subset C_n^+$, hence

$$f_n^-(u, v) \le f_\infty(u, v) \le f_n^+(u, v). \tag{4.73}$$

Next, via (4.68) and (4.69), set $f_n^{\pm}(u, v) = \max_{t \in C_n^{\pm}} \theta_{u,v}(t)$, and $f_\infty(u, v) = \max_{t \in V} \theta_{u,v}(t)$,
where $\theta_{u,v}(t) = F_X(u, t) \wedge F_Y(v, t)$. Since $C_n^- \subset C_{n+1}^-$, for $n \ge 1$, it follows (as shown next)
that $f_n^-(u, v) \to \max_{t \in \bigcup_{n \ge 1} C_n^-} \theta_{u,v}(t)$. Indeed, $\lim_{n \to +\infty} f_n^-(u, v) \le \max_{t \in \bigcup_{n \ge 1} C_n^-} \theta_{u,v}(t)$
and if the previous inequality were strict, there would be $K \in (0, +\infty)$ such that, for all $n \ge 1$,

$$\max_{t \in C_n^-} \theta_{u,v}(t) < K < \max_{t \in \bigcup_{n \ge 1} C_n^-} \theta_{u,v}(t).$$

The left-hand side inequality implies that for all $n \ge 1$, and $t \in C_n^-$, $\theta_{u,v}(t) < K$, contradicting
the right-hand side inequality.

Since $C_n^+ \supset C_{n+1}^+$, for $n \ge 1$, it also follows that $f_n^+(u, v) \to \max_{t \in \bigcap_{n \ge 1} C_n^+} \theta_{u,v}(t)$. Indeed,
we have $\lim_{n \to +\infty} f_n^+(u, v) \ge \max_{t \in \bigcap_{n \ge 1} C_n^+} \theta_{u,v}(t)$ and if the previous inequality were strict,
there would be $K \in (0, +\infty)$ such that, for all $n \ge 1$,

$$\max_{t \in C_n^+} \theta_{u,v}(t) > K > \max_{t \in \bigcap_{n \ge 1} C_n^+} \theta_{u,v}(t).$$

The left-hand side inequality implies that for any $n \ge 1$, there exists $t_n \in C_n^+$ with
$\theta_{u,v}(t_n) \ge K$. Up to a subsequence, $t_n \to t_* \in \bigcap_{n \ge 1} C_n^+$ and by the continuity of $\theta_{u,v}$,
$\theta_{u,v}(t_*) \ge K$, which is inconsistent with the previous right-hand side inequality.

Finally, since $\bigcup_{n \ge 1} C_n^- = V^{\circ}$, the interior of $V$, and since $\bigcap_{n \ge 1} C_n^+ = V = \overline{V}$, the closure of
$V$, we have

$$\lim_{n \to +\infty} f_n^-(u, v) = \max_{t \in V^{\circ}} \theta_{u,v}(t) \le f_\infty(u, v) = \max_{t \in V} \theta_{u,v}(t) = \lim_{n \to +\infty} f_n^+(u, v). \tag{4.74}$$

It remains to show that the maximum of $\theta_{u,v}$ on $V$ is attained on $V^{\circ}$ for $\mathbb{P}_{(B^X, B^Y)}$-almost
all $(u, v)$, i.e., that

$$\mathbb{P}\left( \max_{t \in V(1/m,\dots,1/m)^{\circ}} \theta_{B^X, B^Y}(t) = \max_{t \in V(1/m,\dots,1/m)} \theta_{B^X, B^Y}(t) \right) = 1. \tag{4.75}$$

With (4.75), (4.74) entails $\lim_{n \to +\infty} f_n^{\pm}(u, v) = f_\infty(u, v)$ for $\mathbb{P}_{(B^X, B^Y)}$-almost all $(u, v)$, i.e.,
the right-most expression in (4.70) converges to 0 and, as previously explained, this gives
$M_n^{\pm} \Longrightarrow M_\infty$ and $M_n \Longrightarrow M_\infty$.

In order to complete (4.75) we anticipate, in the second equality below, on the results of 
Section 4.4 in which parameters are changed via: Si = Ui, si+s 2 = u 2 , ■ ■ ■, Si + - • = 

u m _i and where we prove that 


{0b x ,b y ( t)) 


t£V(l/m,...,l/m) 


{8b x ,b y ( s )) 


sev(i,...,i) ^2 rn 


($Bi,b 2 ( u )) 


ueWm(i)’ 
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where W m (l) = {0 = u 0 < ui < ■ ■ ■ < u m _ x < u m = 1}, 


0b u b 2 ( u ) ~ 



XX’W + E --BfWi)) (4-76) 

i=1 i =1 / 


A 



m m 

B?( 1 ) + Y. 

1=1 i=1 



- B.f 


{Ui- 1) 


? 


and with B\ and Bo two independent, standard, m-dimensional Brownian on [0,1]. The 
property (4.75) is thus equivalent to 


P max Ob, b 5 ( u ) = max 0 B , s „(u) ] = 1. (4.77) 

\uew m (i)° ’ uew m (i) ’ J 

The advantage of (4.77) over (4.75) is that the former involves two standard Brownian motions, each one having independent coordinates. Roughly speaking, the property (4.77) should be derived from the following observation: when $u \in \partial W_m(1)$, then $u_k = u_{k+1}$, for some index $k$, and for such a $u$, the sum $\sum_{i=1}^m \big( B_1^{(i)}(u_i) - B_1^{(i)}(u_{i-1}) \big)$ contains only $m-1$ terms. Letting $u_\varepsilon$ be given by

$$u_{\varepsilon,i} = u_i, \ i \ne k+1, \quad \text{and} \quad u_{\varepsilon,k+1} = u_k + \varepsilon,$$

we have

$$\sum_{i=1}^m \big( B_1^{(i)}(u_{\varepsilon,i}) - B_1^{(i)}(u_{\varepsilon,i-1}) \big) = \sum_{i=1}^m \big( B_1^{(i)}(u_i) - B_1^{(i)}(u_{i-1}) \big) + \big( B_1^{(k+1)}(u_k + \varepsilon) - B_1^{(k+1)}(u_k) \big) + \big( B_1^{(k+2)}(u_k) - B_1^{(k+2)}(u_k + \varepsilon) \big).$$

The terms $\big( B_1^{(k+1)}(u_k + \varepsilon) - B_1^{(k+1)}(u_k) \big)$ and $\big( B_1^{(k+2)}(u_k) - B_1^{(k+2)}(u_k + \varepsilon) \big)$ are independent of $\sum_{i=1}^m \big( B_1^{(i)}(u_i) - B_1^{(i)}(u_{i-1}) \big)$ and, from standard properties of Brownian motion, almost surely, the sum $\big( B_1^{(k+1)}(u_k + \varepsilon) - B_1^{(k+1)}(u_k) \big) + \big( B_1^{(k+2)}(u_k) - B_1^{(k+2)}(u_k + \varepsilon) \big)$ takes positive values for arbitrarily small $\varepsilon > 0$. Since the same is true for the second term in (4.76) relative to $B_2$, it follows that in the vicinity of each $u \in \partial W_m(1)$, there is $u_\varepsilon \in W_m(1)$ with $\theta_{B_1,B_2}(u_\varepsilon) > \theta_{B_1,B_2}(u)$. Therefore, $\max_{u \in W_m(1)} \theta_{B_1,B_2}(u)$ is attained in $W_m(1)^\circ$, and so both (4.77) and (4.75) hold true, leading to $M_n \Rightarrow M_\infty$.
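The perturbation identity used in this argument is purely pathwise, so it can be checked numerically on arbitrary functions. The sketch below is only an illustration under stated assumptions: the helper names (`make_path`, `increment_sum`, `perturbation_identity`) are hypothetical, and smooth trigonometric sums stand in for the Brownian coordinates, since only the algebra of the increments is being tested, not any probabilistic property.

```python
import math
import random


def make_path(rng, n_modes=5):
    """Return a smooth deterministic stand-in for one coordinate path t -> B(t)."""
    modes = [(rng.uniform(-1.0, 1.0), rng.uniform(0.5, 9.0)) for _ in range(n_modes)]
    return lambda t: sum(a * math.sin(w * t) for a, w in modes)


def increment_sum(paths, u):
    """Sum over i = 1..m of B^(i)(u_i) - B^(i)(u_{i-1}), with u = (u_0, ..., u_m)."""
    return sum(paths[i](u[i + 1]) - paths[i](u[i]) for i in range(len(paths)))


def perturbation_identity(paths, u, k, eps):
    """Return both sides of the identity for a boundary point with u_k = u_{k+1}."""
    u_eps = list(u)
    u_eps[k + 1] = u[k] + eps            # only coordinate k+1 is moved
    lhs = increment_sum(paths, u_eps)
    b_k1, b_k2 = paths[k], paths[k + 1]  # B^(k+1) and B^(k+2); the list is 0-indexed
    rhs = (increment_sum(paths, u)
           + (b_k1(u[k] + eps) - b_k1(u[k]))
           + (b_k2(u[k]) - b_k2(u[k] + eps)))
    return lhs, rhs


rng = random.Random(0)
paths = [make_path(rng) for _ in range(5)]   # m = 5 coordinates
u = [0.0, 0.15, 0.4, 0.4, 0.7, 1.0]          # boundary point: u_2 = u_3
lhs, rhs = perturbation_identity(paths, u, k=2, eps=0.01)
```

Here `lhs` and `rhs` agree up to floating-point rounding, which reflects that only the terms of index $k+1$ and $k+2$ change when $u_{k+1}$ is moved to $u_k + \varepsilon$.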


4.4 Final Step: A Linear Transformation 


By combining the results of the previous three subsections, we proved that 

771—1 / _, \ m— 1 


LCI n — n/m 
\/ 2 n 


1 

max mm | — 

, m 




Yb (i) ’ x 


i=l 


m 


3 = 1 


i =1 
(4.78)


1 TO — 1 / 1 \ TO—1 / / « \ / i—1 

^ E‘i-^- y 

»= 1 V 7 1=1 \ \jr = l / \j = l 

where the maximum is taken over t = (ti, ..., t m _ i) G V(l/m,..., 1/m). Now, via the 
linear transformations of the parameters given by s* = m i i = 1,..., m — 1, So = 
fo = 0, and Brownian scaling, the right-hand side of (4.78) becomes equal, in law, to: 


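$$\frac{1}{\sqrt{m}} \max_{0 = s_0 \le s_1 \le \cdots \le s_{m-1} \le 1} \min\Big( \frac{1}{m} \sum_{i=1}^{m-1} B^{(i),X}(1) - \sum_{i=1}^{m-1} \big( B^{(i),X}(s_i) - B^{(i),X}(s_{i-1}) \big), \ \frac{1}{m} \sum_{i=1}^{m-1} B^{(i),Y}(1) - \sum_{i=1}^{m-1} \big( B^{(i),Y}(s_i) - B^{(i),Y}(s_{i-1}) \big) \Big). \tag{4.79}$$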


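Next, for all $t \in [0,1]$ and $i = 1, \dots, m-1$, let us introduce the following two pointwise linear transformations:

$$B^{(i),X}(t) = \frac{B_1^{(m)}(t) - B_1^{(i)}(t)}{\sqrt{2}}, \qquad B^{(i),Y}(t) = \frac{B_2^{(m)}(t) - B_2^{(i)}(t)}{\sqrt{2}},$$

where $B_1$ and $B_2$ are two standard $m$-dimensional Brownian motions on $[0,1]$. Clearly $(B^{(1),X}(t), \dots, B^{(m-1),X}(t))_{0 \le t \le 1}$ has the correct covariance matrix (4.13), and similarly for $B_2$, replacing $X$ by $Y$. Moreover,

$$\begin{aligned}
&\frac{1}{m} \sum_{i=1}^{m-1} B^{(i),X}(1) - \sum_{i=1}^{m-1} \big( B^{(i),X}(s_i) - B^{(i),X}(s_{i-1}) \big) \\
&\quad = \frac{1}{\sqrt{2}\, m} \sum_{i=1}^{m-1} \big( B_1^{(m)}(1) - B_1^{(i)}(1) \big) - \frac{1}{\sqrt{2}} \sum_{i=1}^{m-1} \big( B_1^{(m)}(s_i) - B_1^{(m)}(s_{i-1}) \big) + \frac{1}{\sqrt{2}} \sum_{i=1}^{m-1} \big( B_1^{(i)}(s_i) - B_1^{(i)}(s_{i-1}) \big) \\
&\quad = \frac{1}{\sqrt{2}} \Big( -\frac{1}{m} \sum_{i=1}^{m} B_1^{(i)}(1) + \big( B_1^{(m)}(1) - B_1^{(m)}(s_{m-1}) \big) + \sum_{i=1}^{m-1} \big( B_1^{(i)}(s_i) - B_1^{(i)}(s_{i-1}) \big) \Big). \tag{4.80}
\end{aligned}$$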

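Finally, with the help of (4.80) (and the corresponding identity for $Y$), (4.79) becomes:

$$\frac{1}{\sqrt{2m}} \max_{0 = s_0 \le s_1 \le \cdots \le s_{m-1} \le s_m = 1} \min\Big( -\frac{1}{m} \sum_{i=1}^{m} B_1^{(i)}(1) + \sum_{i=1}^{m} \big( B_1^{(i)}(s_i) - B_1^{(i)}(s_{i-1}) \big), \ -\frac{1}{m} \sum_{i=1}^{m} B_2^{(i)}(1) + \sum_{i=1}^{m} \big( B_2^{(i)}(s_i) - B_2^{(i)}(s_{i-1}) \big) \Big), \tag{4.81}$$

and the proof of Theorem 1.1 is over.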
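The algebra behind (4.80) can be sanity-checked on simulated paths. The following is a minimal sketch, assuming the transformation $B^{(i),X} = (B_1^{(m)} - B_1^{(i)})/\sqrt{2}$ and paths started at 0; Gaussian random walks on a grid stand in for the Brownian coordinates, and the function names (`simulate_paths`, `lhs_480`, `rhs_480`) are hypothetical. The times $s_i$ are represented by grid indices.

```python
import math
import random


def simulate_paths(m, n_steps, rng):
    """m independent Gaussian random walks on the grid j / n_steps, each started at 0."""
    sd = math.sqrt(1.0 / n_steps)
    paths = []
    for _ in range(m):
        walk = [0.0]
        for _ in range(n_steps):
            walk.append(walk[-1] + rng.gauss(0.0, sd))
        paths.append(walk)
    return paths


def lhs_480(b1, s):
    """(1/m) sum_i B^{(i),X}(1) - sum_i (B^{(i),X}(s_i) - B^{(i),X}(s_{i-1})), i = 1..m-1."""
    m, last = len(b1), len(b1[0]) - 1

    def bx(i, j):
        # B^{(i),X} at grid index j, for i = 1..m-1
        return (b1[m - 1][j] - b1[i - 1][j]) / math.sqrt(2.0)

    first = sum(bx(i, last) for i in range(1, m)) / m
    second = sum(bx(i, s[i]) - bx(i, s[i - 1]) for i in range(1, m))
    return first - second


def rhs_480(b1, s):
    """Right-hand side of the identity, written with the coordinates of B_1 only."""
    m, last = len(b1), len(b1[0]) - 1
    total = -sum(b1[i][last] for i in range(m)) / m
    total += b1[m - 1][last] - b1[m - 1][s[m - 1]]
    total += sum(b1[i - 1][s[i]] - b1[i - 1][s[i - 1]] for i in range(1, m))
    return total / math.sqrt(2.0)


rng = random.Random(42)
b1 = simulate_paths(m=4, n_steps=100, rng=rng)
s = [0, 10, 35, 80]          # grid indices of s_0 = 0 <= s_1 <= s_2 <= s_3
left, right = lhs_480(b1, s), rhs_480(b1, s)
```

The two values coincide up to rounding. The check relies on the paths vanishing at time 0, which is what makes the increments of the $m$-th coordinate telescope to $B_1^{(m)}(s_{m-1})$.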
5 Concluding Remarks


Let us discuss below some potential extensions to Theorem 1.1 and some questions we 
believe are of interest. 


• From the proof presented above, the passage from two to three or more sequences is 
clear: the minimum over two Brownian functionals becomes a minimum over three or more 
Brownian functionals, and such a passage applies to the cases touched upon below. 

• It is also clear from the proof developed above that a theorem for two sequences of iid (non-uniform) random variables is also valid. Here is what it should look like: Let $X = (X_i)_{i \ge 1}$ and $Y = (Y_i)_{i \ge 1}$ be two sequences of iid random variables with values in $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$, a totally ordered finite alphabet of cardinality $m$, and with a common law, i.e., $X_1 \overset{\mathcal{L}}{=} Y_1$. Let $p_{\max} = \max_{i=1,2,\dots,m} \mathbb{P}(X_1 = \alpha_i)$ and let $k$ be the multiplicity of $p_{\max}$. Then,

$$\frac{LCI_n - n p_{\max}}{\sqrt{n p_{\max}}} \Longrightarrow \max_{0 = t_0 \le t_1 \le \cdots \le t_{k-1} \le t_k = 1} \min\Big( \frac{\sqrt{1 - k p_{\max}} - 1}{k} \sum_{j=1}^{k} B_1^{(j)}(1) + \sum_{i=1}^{k} \big( B_1^{(i)}(t_i) - B_1^{(i)}(t_{i-1}) \big), \ \frac{\sqrt{1 - k p_{\max}} - 1}{k} \sum_{j=1}^{k} B_2^{(j)}(1) + \sum_{i=1}^{k} \big( B_2^{(i)}(t_i) - B_2^{(i)}(t_{i-1}) \big) \Big), \tag{5.1}$$

where $B_1$ and $B_2$ are two $k$-dimensional standard Brownian motions defined on $[0,1]$. So, for instance, if $p_{\max}$ is uniquely attained then the limiting law in (5.1) is the minimum of two centered Gaussian random variables.

Using the sandwiching techniques developed in [HL], an infinite countable alphabet 
result can also be obtained with (5.1). 

• The loss of independence inside the sequences, and the loss of identical distributions, both within and between the sequences, are more challenging. Results for these situations will be presented elsewhere.

• The length of the longest increasing subsequence of a random word is well known to have an equivalent interpretation in percolation theory. Indeed, consider the following directed last-passage percolation model in $\mathbb{Z}_+^2$: let $\Pi_2(n,m)$ be the set of directed paths in $\mathbb{Z}_+^2$ from $(0,0)$ to $(n,m)$ with unit steps going either North or East. Given random variables $\omega_{i,j}$, $i \ge 0$, $j \ge 1$, and interpreting each $\omega_{i,j}$ as the length of time spent by a path at the vertex $(i,j)$, the last-passage time to $(n,m)$ is given by

$$T_2(n,m) = \max_{\pi \in \Pi_2(n,m)} \Big( \sum_{(i,j) \in \pi} \omega_{i,j} \Big). \tag{5.2}$$

(See Bodineau and Martin [BM], and the references therein, for details.) In our random word context, when $X = (X_i)_{1 \le i \le n}$ is a sequence of iid random variables taking their values in a totally ordered finite alphabet $\{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$ of size $m$, taking $\omega_{i,j} = \mathbf{1}_{\{X_i = \alpha_j\}}$ and $\omega_{0,j} = 0$, $j \ge 1$, which for each $i$ are dependent random variables, the length of the longest increasing subsequence of the random word is equal to the last-passage time $T_2(n,m)$, see [BH].

Now $LCI_n$, the length of the longest common and increasing subsequences, enjoys a similar percolation theory interpretation, but in $\mathbb{Z}_+^3$. Let $\Pi_3(n,n,m)$ be the set of paths in $\mathbb{Z}_+^3$ from $(0,0,0)$ to $(n,n,m)$ taking either unit steps towards the top or steps, of any length, in the horizontal plane but neither parallel to the $x$-axis nor to the $y$-axis, i.e.,

$$\Pi_3(n,n,m) := \Big\{ (u_1, u_2, \dots, u_{n+m}) \in (\mathbb{Z}_+^3)^{n+m} : u_1 = (0,0,1), \ u_{n+m} = (n,n,m), \ u_{j+1} - u_j \in \{(0,0,1), (a,b,0) \text{ with } a, b \in \mathbb{N} \setminus \{0\}\}, \ j = 1, \dots, n+m-1 \Big\}.$$

Given weights $\omega_{i,j,k}$, $i \ge 0$, $j \ge 0$, $k \ge 1$, on the lattice, we can consider a quantity analogous to $T_2(n,m)$ in (5.2), namely,

$$T_3(n,n,m) := \max_{\pi \in \Pi_3(n,n,m)} \Big( \sum_{(i,j,k) \in \pi} \omega_{i,j,k} \Big).$$

In the random word context, taking $\omega_{i,j,k} = \mathbf{1}_{\{X_i = Y_j = \alpha_k\}}$ and $\omega_{0,0,k} = 0$, $k \ge 1$, as weights, gives $LCI_n = T_3(n,n,m)$.

Note that when A" = Y, T 3 (n,n,m ) recovers T 2 (n, m) since T 2 (n, rn) is unchanged if, in 
(5.2), n 2 (n, m) is replaced by 

n 2 (n,m) := |(iti, u 2 , ■ ■ ■, u n+m ) G ( Z 2 + ) n+m : u x = (0,1 ),u n+m = (n,m), 

Uj + 1 — Uj G {(0,1), (a, 5) with a, b G N \ {0}}, j = 1,... , n + m — 1 j. 


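More generally, for $p \ge 3$ sequences of letters $X^{(\ell)} = (X_i^{(\ell)})_{1 \le i \le n}$, $1 \le \ell \le p$, we can similarly consider

$$\Pi_{p+1}(n,\dots,n,m) := \Big\{ (u_1, u_2, \dots, u_{n+m}) \in (\mathbb{Z}_+^{p+1})^{n+m} : u_1 = (0,\dots,0,1), \ u_{n+m} = (n,\dots,n,m), \ u_{j+1} - u_j \in \{(0,\dots,0,1), (a_1,\dots,a_p,0) \text{ with } a_i \in \mathbb{N} \setminus \{0\}\}, \ j = 1, \dots, n+m-1 \Big\},$$

and

$$T_p(n,\dots,n,m) := \max_{\pi \in \Pi_{p+1}(n,\dots,n,m)} \Big( \sum_{(i_1,\dots,i_p,k) \in \pi} \omega_{i_1,\dots,i_p,k} \Big).$$

Then, observe that $LCI_n$, for the $p$ sequences, is equal to $T_p(n,\dots,n,m)$, where now $\omega_{i_1,\dots,i_p,k} = \mathbf{1}_{\{X^{(1)}_{i_1} = \cdots = X^{(p)}_{i_p} = \alpha_k\}}$ and $\omega_{0,\dots,0,k} = 0$, $k = 1, \dots, m$, are dependent random variables.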

In view of Theorem 1.1 and of [BM], one would expect that for $m$ fixed and for exponential mean one iid weights $\omega_{i,j,k}$, $T_3(n,n,m)$ converges, when properly centered, by $n$, and scaled, by $\sqrt{n}$, towards


max min 

0=to<tl<-<tm— l<tm=l 



B?(t (it) 

i=l 
with also the trivial modification for $T_p$.
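The two percolation identities of this remark can be illustrated on small words. The sketch below is not taken from [BH] or [BM]; it is a minimal illustration with hypothetical function names, assuming letters coded $1, \dots, m$. `last_passage_lis` evaluates $T_2(n,m)$ by dynamic programming over North/East paths started at $(0,1)$ with the indicator weights, `lci` uses a three-index recursion in the spirit of $T_3(n,n,m)$ (an assumed but standard recursion, not the authors' construction), and `brute_force_lci` is an exhaustive reference.

```python
from itertools import combinations


def last_passage_lis(word, m):
    """T_2(n, m) for weights w(i, j) = 1{word[i-1] == j}; column i = 0 has zero weight."""
    n = len(word)
    T = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = 1 if word[i - 1] == j else 0
            T[i][j] = w + max(T[i - 1][j], T[i][j - 1])  # arrive by an East or a North step
    return T[n][m]


def lci(x, y, m):
    """Longest common weakly increasing subsequence via a three-index recursion."""
    n1, n2 = len(x), len(y)
    # L[i][j][k]: best length using x[:i], y[:j] and only letters <= k
    L = [[[0] * (m + 1) for _ in range(n2 + 1)] for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            for k in range(1, m + 1):
                best = max(L[i - 1][j][k], L[i][j - 1][k], L[i][j][k - 1])
                if x[i - 1] == y[j - 1] == k:
                    best = max(best, L[i - 1][j - 1][k] + 1)
                L[i][j][k] = best
    return L[n1][n2][m]


def brute_force_lci(x, y):
    """Exhaustive reference: longest weakly increasing subsequence of x that also sits in y."""
    def is_subsequence(s, w):
        it = iter(w)
        return all(c in it for c in s)

    for r in range(min(len(x), len(y)), 0, -1):
        for idx in combinations(range(len(x)), r):
            s = [x[i] for i in idx]
            if all(a <= b for a, b in zip(s, s[1:])) and is_subsequence(s, y):
                return r
    return 0


x = [1, 3, 2, 2, 3, 1, 2]
y = [2, 1, 2, 3, 3, 1, 2]
lci_xy = lci(x, y, 3)
lis_x = last_passage_lis(x, 3)
```

On such examples `lci(x, y, 3)` agrees with the exhaustive search, and `lci(x, x, 3)` agrees with `last_passage_lis(x, 3)`, in line with the remark that the case $X = Y$ recovers $T_2(n,m)$.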

• Starting with Baryshnikov [Bar] and Gravner, Tracy and Widom [GTW] (see also [BGH] for a further description and up-to-date references), a strong interaction has been shown to exist between Brownian functionals, originating in queuing theory with Glynn and Whitt [GW] (see also Seppäläinen [Sep]), and maximal eigenvalues of Gaussian random matrices. Likewise, we hypothesize that the max/min functionals obtained here do enjoy a similar strong connection (which might extend to spectra and Young diagrams). Could it be that the right-hand side of (1.1) (with or without the linear terms) has the same law as the maximal eigenvalue of a random matrix model? Even in the binary case, it would be interesting to find the law of $\sqrt{2} \max_{0 \le t \le 1} \min\big( B_1(t) - B_1(1)/2, \ B_2(t) - B_2(1)/2 \big)$ and of $\max_{0 \le t \le 1} \min\big( B_1(t), B_2(t) \big)$ where, say, $B_1$ and $B_2$ are two independent standard linear Brownian motions. Very preliminary work on these problems was started with Marc Yor, before his untimely death, and this text is dedicated to his memory.

• To finish, note that the LCIS problem for two or more uniform random permutations of $\{1, 2, \dots, n\}$ has not been studied either, although it certainly deserves to be. In point of fact, it is shown in [HI] that, for any two independent uniform random permutations $\sigma_1$ and $\sigma_2$ of $\{1, 2, \dots, n\}$, and for any $x \in \mathbb{R}$, $\mathbb{P}(LC_n(\sigma_1, \sigma_2) \le x) = \mathbb{P}(LI_n(\sigma_1) \le x)$, where $LI_n(\sigma_1)$ is the length of the longest increasing subsequences of $\sigma_1$. Therefore, this equality in law shows the emergence of the Tracy-Widom distribution, which had sometimes been speculated, as the corresponding limiting law. Indeed, once we are given the result of Baik, Deift and Johansson [BDJ] on the limiting law of $LI_n(\sigma_1)$, a corresponding result (actually equivalent to it) for $LC_n(\sigma_1, \sigma_2)$ is immediate. In fact, many of the results on $LI_n(\sigma_1)$ presented in Romik [Rom], such as the law of large numbers of Vershik and Kerov [VK], are instantaneously transferable to equivalent versions for $LC_n(\sigma_1, \sigma_2)$.

Moreover, for $p \ge 3$ independent and uniform random permutations $\sigma_1, \sigma_2, \dots, \sigma_p$, the methodology developed in [HI] easily shows that $LC_n(\sigma_1, \sigma_2, \dots, \sigma_p) \overset{\mathcal{L}}{=} LCI_n(\sigma_1, \dots, \sigma_{p-1})$, where $\overset{\mathcal{L}}{=}$ denotes equality in distribution. Therefore, the study of longest common and increasing subsequences in random words or random permutations, which might appear, at first, quite artificial, is actually intimately related to the study of longest common subsequences.
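For two permutations, the equality in law between $LC_n(\sigma_1, \sigma_2)$ and $LI_n(\sigma_1)$ can be verified by exhaustive enumeration for small $n$. The sketch below is only an illustration with hypothetical function names; it uses the textbook dynamic programs for longest common and longest increasing subsequences, and compares the counts over all pairs of permutations with $n!$ times the counts over single permutations.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lis_length(a):
    """Length of the longest increasing subsequence (quadratic DP)."""
    best = []
    for i, x in enumerate(a):
        best.append(1 + max((best[j] for j in range(i) if a[j] < x), default=0))
    return max(best, default=0)


def laws(n):
    """Counts of LC_n over all pairs of permutations and of LI_n over all permutations."""
    perms = list(permutations(range(1, n + 1)))
    lc = Counter(lcs_length(s1, s2) for s1 in perms for s2 in perms)
    li = Counter(lis_length(s) for s in perms)
    return lc, li


lc4, li4 = laws(4)
same_law = all(lc4[k] == factorial(4) * li4[k] for k in range(1, 5))
```

The reason the counts match is a relabeling: renaming the letters so that $\sigma_1$ becomes the identity turns a common subsequence into an increasing subsequence of the relabeled $\sigma_2$, and this relabeling is a bijection for each fixed $\sigma_1$.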


A Appendix 

A.l Proofs of technical lemmas 

Proof of Lemma 4.1 

First,

$$\max_{k=1,\dots,K} (a_k \wedge b_k) - \max_{k=1,\dots,K} \big( (a_k + c_k) \wedge (b_k + d_k) \big) \le \max_{k=1,\dots,K} \big| (a_k \wedge b_k) - \big( (a_k + c_k) \wedge (b_k + d_k) \big) \big|.$$

Next, the result will follow from the elementary inequality

$$(a \wedge b) - (a + c) \wedge (b + d) \le |c| \vee |d|, \tag{A.1}$$

which is valid for all $a, b, c, d \in \mathbb{R}$. Indeed, set $D = (a \wedge b) - (a + c) \wedge (b + d)$ and assume (without loss of generality) that $a \le b$. If $a + c \le b + d$, then $D = a - (a + c) = -c \le |c|$. If $b + d < a + c$, then $D = a - b - d$ and so whenever $a \le b + d$, (A.1) is immediate, while if $a > b + d$, then $D = a - b - d \le -d = |d|$ since $a - b \le 0$ and $-d > b - a \ge 0$. □
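Inequality (A.1), and the bound on the difference of maxima that the two steps of this proof combine to give, lend themselves to a quick randomized check. The sketch below is only an illustration; the function names are hypothetical, and a small tolerance absorbs floating-point rounding.

```python
import random


def gap(a, b, c, d):
    """D = (a ^ b) - ((a + c) ^ (b + d)), with ^ denoting the minimum."""
    return min(a, b) - min(a + c, b + d)


def max_gap(a, b, c, d):
    """Difference of the two maxima over k appearing at the start of the proof."""
    return (max(min(x, y) for x, y in zip(a, b))
            - max(min(x + s, y + t) for x, y, s, t in zip(a, b, c, d)))


def holds_on_random_inputs(trials=20000, K=6, seed=1, tol=1e-12):
    """Check (A.1) coordinate-wise and its consequence for the maxima on random data."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, d = ([rng.uniform(-5.0, 5.0) for _ in range(K)] for _ in range(4))
        # (A.1), coordinate by coordinate
        if any(gap(*q) > max(abs(q[2]), abs(q[3])) + tol for q in zip(a, b, c, d)):
            return False
        # consequence: difference of maxima bounded by max_k (|c_k| v |d_k|)
        bound = max(max(abs(s), abs(t)) for s, t in zip(c, d))
        if max_gap(a, b, c, d) > bound + tol:
            return False
    return True


ok = holds_on_random_inputs()
```

The case $a = 0$, $b = 1$, $c = -2$, $d = 0$ gives $D = 2 = |c|$, so the bound in (A.1) cannot be improved in general.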


Proof of Lemma 4.2 

Let $D_n = \{ |N^{(n)} - \mathbb{E}[N^{(n)}]| \le x_n \}$, and for $\varepsilon > 0$, let


Ai(£) — | Ylj 


? -e[JV(™),E[iv(")]] Jh — £ ' , ‘ 


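Since $\mathbb{P}(A_n(\varepsilon)) \le \mathbb{P}(A_n(\varepsilon) \cap D_n) + \mathbb{P}(D_n^c)$, and since $\lim_{n \to +\infty} \mathbb{P}(D_n^c) = 0$, it is enough to show $\lim_{n \to +\infty} \mathbb{P}(A_n(\varepsilon) \cap D_n) = 0$. But, by Kolmogorov's maximal inequality,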


P (A n (e)nh n ) < P max 

1 |fc-E[Ah")][<x„ 


£ 

je[fc,E[JV(")]] 


> 5 


< X “ Vaj(Z 1 ) -4 0 . 


e 2 n 


□ 


Proof of Lemma 4.3 

First, we show that (B^ (iVfc/n) 2 ) is uniformly integrable. Proposition 3.2 and Re¬ 
mark 3.1 give 


E 


r>(k) ( Afc 

n \ n ) 

B (k) p ] < 2 p- i ( E 
n 


+ op(1/V^ 

fc r | / —p/2\ 

+ o[n p/ ) 




= 2" 1/2 n" p/2 E 


V2n 

\N m - N k H + o(rT^ 2 ) 


But $N_m - N_k = \sum_{i=1}^n e_i^{(m,k)}$, where $(e_i^{(m,k)})_{i \ge 1}$ are iid with $e_1^{(m,k)} = 1$ when $X_1 = \alpha_m$, $e_1^{(m,k)} = -1$ when $X_1 = \alpha_k$ and $e_1^{(m,k)} = 0$ otherwise. Hence, by the classical Marcinkiewicz-Zygmund inequality, for some constant $C_p$,

E 


\N m -N k \ p 


E [|Z)ei 


pi 


i =1 
Therefore, for any $p > 2$


< c r E[(^k 


( m,k ) 12 


p/ 2l 


i=l 


< C p n p/2 . 


sup E 

n> 1 


*•(£) 


pi 


< +CX) 


and {Bn\N k /n) 2 ^ n>1 is uniformly integrable. Next, for {Bn^ (l/m)) 


n>l' 


i i t n / m l / r i\ 

R (fc)fl\ = J_ V 7 (fc) 4- P 1 M 7 
- \ m ) ^ i 0^ 


(k) 

n [n/m]+ 1" 


and 


E 


R { , fc) f—VI < 2 p_1 n _p/2 E 


m 


1 - [n/m] 


E z . 


(*) 


< C fl n- p/2 E 


J=1 
[n/m] 

Ei*. 

i=i 


( fc ) |2 


+ 2 p_1 n _p/2 E 

p/2l 

+ 2 p - 1 n" p/2 E 


7(fc) 


n/m] +1 


PI 


Z\ 


\k) 


using again the Marcinkiewicz-Zygmund inequality. Continuing, using convexity, 


C p 7l~ p/2 E 


r [n/m] 


E i z 


Ml 2 


J=1 

Hence, for any p > 2. 


P/2 


< C p n- p/2 E 


[n/m] 


[n/m] P//2 1 |Z 


(fc)i 


sup E 

n> 1 




m 


i=i 


< +00, 


C p ^ 
< — 
m pi- 


|^(fc) ip 


and (i?i fc /l/m) 2 ) n>1 is uniformly integrable and therefore, from above, so is ( Bn\Nk/n ) — 
Bn\l/m)) 2 . Finally, in order to show (4.17), it is enough to prove 




n —>• +cx). 


(A.2) 


Setting $A_n = \{ |N_k - n/m| \le \sqrt{n} \ln n \}$, Hoeffding's inequality ensures that $\lim_{n \to +\infty} \mathbb{P}(A_n^c) = 0$. Therefore, since


AT 1 1 Nk 


j=[n/m]+l 


we have 


P (| B "‘ ) (t) _ jB ”'‘ ) (^)|- £ ) - P ({l E Zf|S £ VS} n£l »)+ P 7n). 

k j=[n/m]+l p 
and Kolmogorov's maximal inequality entails that


P 


/ 

i 

\ ^ 

r- n/m+y/n Inn -i 

( max 

E z f 

> E\fn ) < —z— E 

E ( z f) 2 

\ l£[n/m—y/^\nn,n/m+y/n\nn] 

j=[n/m]+1 

J e 2 n 

L j=[n/m\+1 


< 


(ln? 2 )E[(zf ) ) 2 ] 


n 


finishing the proof of (A.2) and thus of (4.17). 

Proof of Lemma 4.4 

Consider three cases: $|x| \ll n$, $x \gg n$ (here $u_n \ll v_n$ means $\lim_{n \to +\infty} u_n / v_n = 0$) and $x \sim n$, i.e., $c_1 n < x < c_2 n$, for two finite constants $c_1$ and $c_2$, and expand $K_n(x)$ accordingly. First, let $|x| \ll n$: then,

K n (x) = 


(2 n) x+2n (1 + i^) x+2n 
{2n) x+n (2n) n (1 + ^) x+n 

exp ^(x + 2 n) In ^1 + — (x + n) In ^1 + — ^ 


( , rjf » / ry, 2 

x 2 3x 3 f x 3 \ f x 2 
eXp| -fc + W +0 U5 )+“ - 


(Y> / /y»2 

t •. 1 tXj tAs I 

(x + n) K-2^ + °U 


= exp 


x 2 (x 2 

- -b O ( — 

4n \ n 


which yields (4.33) in case $|x| \ll n$. Next, let $x \gg n$: then,
(x + 2 n) x+2n x n (1 + f ) x+2n 


K n {x ) = 


(2x + 2n) x+n (2u) n (4n) n 2- T (1 + ^) x+n 

(4ny*2* eXp ( (X + 2n) 0 + I) - (X + n) 0 + ;)) 

x n (, . (2n 2n 2 f n 2 \\ , N / n n 2 


(An) n 2 x ~ ^ V v ' ’ V x x 2 \x 2 )) V x 2x 2 ' \ x 2 


exp ((x + 2n) ( — — + o ( — )) — (x + n)\ — — —— + o [ — 


n 


x 


3 n 2 7n 3 / rr 


(4 n )»2» eXp P + 27-2? + °fx 

f 3 n 2 , ( x \ , ^ f n 2 

exp I n + —-h nm I — 1 - ilnz + o — 

\ ZjOC \ “t77// \ (Z/ 


(A.3) 


Since $x \gg n$, the largest order in the exponential (A.3) is $x \ln 2$ and this recovers a bound of the form (4.33) in this case. Finally, consider the case $x \sim n$, say $x = an$ with $a > -1$. Then,

$$K_n(x) = \frac{((a+2)n)^{(a+2)n}}{((2a+2)n)^{(a+1)n} (2n)^n} = \exp(-c(a) n),$$

which is again of the form (4.33), since $c(a) = \ln\big( 2 (2a+2)^{a+1} / (a+2)^{a+2} \big)$ is positive for all $a > -1$ and is also bounded. □
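The last case can be checked numerically. The sketch below assumes that $K_n(x) = (x+2n)^{x+2n} / \big( (2x+2n)^{x+n} (2n)^n \big)$, which is the form read off from the three displays of this proof rather than from its definition in the main text, and it works with logarithms to avoid overflow. The function names are hypothetical.

```python
import math


def log_K(n, x):
    """log K_n(x), assuming K_n(x) = (x+2n)^(x+2n) / ((2x+2n)^(x+n) * (2n)^n)."""
    return ((x + 2 * n) * math.log(x + 2 * n)
            - (x + n) * math.log(2 * x + 2 * n)
            - n * math.log(2 * n))


def c(a):
    """c(a) = ln( 2 (2a+2)^(a+1) / (a+2)^(a+2) ), defined for a > -1."""
    return math.log(2.0) + (a + 1) * math.log(2 * a + 2) - (a + 2) * math.log(a + 2)


n = 1000
# For x = a*n the powers of n cancel exactly, leaving log K_n(an) = -c(a) * n.
checks = {a: (log_K(n, a * n), -c(a) * n) for a in (-0.5, 0.5, 1.0, 3.0)}
# Under this reading, a = 0 gives K_n(0) = 1, i.e. c(0) = 0.
c_at_zero = c(0.0)
```

Each pair in `checks` agrees up to rounding, and `c(a)` is positive at the sampled nonzero values of `a`.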


A.2 On [HLM] 


The purpose of this Appendix is to provide some missing steps in the proof of the main theorem in [HLM], devoted to the binary case, as well as to correct the errors present there. The notations and numbering are as in [HLM]. In particular, recall that $N_1$ (resp. $N_2$) is the number of zeros in $X_1, \dots, X_n$ (resp. $Y_1, \dots, Y_n$).

Proof of (13). Recall again from [HLM] that 


K. = 


AT = 


max 

0<k<N 1 AN 2 


D +S *( k - 


(a (-B 


n 


Clearly, 


An > 


A(-^6) + ^@) 


-2 A ^ (l 
2 — 1,2 v 


(A.4) 


and denote by A the index for which the minimum in (A.4) is attained. 

Next, if $N_1 \wedge N_2 < n/2$, then $V_n < X_n$; and similarly if the maximum defining $V_n$ is attained at some $k^* < n/2$, then $V_n < X_n$. Otherwise, $N_1 \wedge N_2 > n/2$ with, moreover, the maximum defining $V_n$ attained at $k^* \in [n/2, N_1 \wedge N_2]$ and so:


V n 





Now, via (A.4), 


V n ~X n < 

< 
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< 


< 



Inequality (A.6) replacing (15) of [HLM] and its proof. If $N_1 \wedge N_2 > n/2$, then $X_n < V_n$ and similarly if the maximum defining $X_n$ is attained for some $t < (N_1 \wedge N_2)/n$, then $X_n = V_n$. Therefore, the remaining case in comparing $X_n$ and $V_n$ consists in $N_1 \wedge N_2 < n/2$ and a maximum defining $X_n$ attained at some $t^* \in [(N_1 \wedge N_2)/n, 1/2]$. In this case,


X„ 



+ Bf(f 


1 


and 


K - A H 5 -’ G) + s ' ( 


Ni A N 2 


n 


(A.5) 


Again, denote by A the index for which the minimum in (A.5) is attained. Then, 

A 1 A N 2 


X n — V n < 


A (5) + § X')) - A (-5^ (5) + A" ( 

s (-^b> (0 + - (~\sa g) + §A ( 


n 


Ni A N 2 


= b£-\?)~b^ 


N\ A N 2 


n 


< 


< 


fe 


max 

- N 1 AN 2 1 ] 


V 


max 


• fr \N 1 AN 2 11 
2 =1,2 L n ’ 2 J 


Bt>(t) ~ -§<-> 


B®(«) - Bf 


N\ A A^2 
n 

N\ A A^2 
n 


(A.6) 


Since (15) of [HLM] has to be replaced by (A.6), instead of (16) of [HLM], we now have to 
prove that for i = 1,2: 


te 


max 

N 1 AN 2 1 ] 

n ’2 




(0 


A A^2 


n 


0. 


(A.7) 


The difference with (16) of [HLM] is that $N$, therein, is now replaced by $N_1 \wedge N_2$, which makes the situation more complex since one of the two quantities $N_1$ or $N_2$ is not independent of $B_n$. To prove (A.7), and so as not to further burden the notation, the superscript $i$ in the Brownian approximation $B_n^{(i)}$ is dropped. First, let


Cl 


N\ 


n 

2 
and, in a similar fashion, define $C_n^2$ by replacing $N_1$ with $N_2$. Clearly, $\lim_{n \to +\infty} \mathbb{P}((C_n^1)^c) = \lim_{n \to +\infty} \mathbb{P}((C_n^2)^c) = 0$. Next, for $\varepsilon > 0$, let


A n = < max 


iS 


yi/xAy i] 


B n (t) - B n 


Ni A N 2 


n 


> £ 


Then, 


PW < P(A,nc;nc3+p((c;r)+p((c3«), (A.s) 

and since on C\ (resp. Cl), Ni > n/2 — \Jri In n (resp. N 2 > n/2 — \Jri In n ), 

J , (A.9) 

where the random variables are iid with mean zero and variance one and assuming that $n/2 - \sqrt{n} \ln n$ and $n/2$ are integers (if not, replace throughout the first value by its integer part and the second by its integer part plus one). To deal with (A.9), first note that on $C_n^1 \cap C_n^2$, $N_1 \wedge N_2 \in [n/2 - \sqrt{n} \ln n, n]$; the right-hand side of (A.9) is then clearly upper-bounded by


P(Ai n C/ n Cl) < 



max 

fc=^ — x/nlnn,...,^ 


Y & 

j=N 1 KN 2 


> eV2n \ n cl n cl 


p 


max 

^ In n<£<k< ^ 


k 

x> 

j=e 


>ej^\rc l n rcl 


< P I < max 

|-^/nlnn<K| 

8 In n 

< -j-p, 

£ \/n 


n/2 

Y& 

j=k 


>lC)nC^Cl 


where the inequality in (A. 10) follows from the bound 


max 

^ — y/n In n<£<k< ^ 




3=*- 


< max 

Tj — y/nln n<£<k< ^ 


< 2 max 

^ — \/n In n<k< ^ 


n/2 

j=k 


+ 


n/2 

Y & 

j=£ 


n/2 

Y & 

j=k 


(A.10) 
(A.11) 


while the one in (A.11) is Kolmogorov's maximal inequality. Therefore, the right-hand side of (A.9) converges to zero, finishing, via (A.8), the proof of (A.7). □























Acknowledgments 

This research is supported in part by the grant #246283 from the Simons Foundation and by a Simons Foundation Fellowship grant #267336. The first author thanks the School of Mathematics of the Georgia Institute of Technology for several visits during which part of this work was done. The second author thanks the Centre Henri Lebesgue of the Université de Rennes 1, the Département MAS of École Centrale Paris, the LPMA of the Université Pierre et Marie Curie and CIMAT, Gto, Mexico for their hospitality, while this work was in progress. Both authors thank an anonymous referee for valuable comments which helped to improve this manuscript.

References 

[BDJ] J. Baik, P. Deift, and K. Johansson. On the distribution of the length of the longest increasing subsequence of random
permutations. J. Amer. Math. Soc., 12(4), pp. 1119-1178, 1999.

[Bar] Y. Baryshnikov. GUEs and queues. Probab. Theory Relat. Fields, vol. 119, pp. 256-274, 2001.

[BGH] F. Benaych-Georges, C. Houdré. GUE minors, maximal Brownian functionals and longest increasing subsequences
in random words. Markov Processes Relat. Fields, vol. 21, pp. 109-126, 2015.

[Bil] P. Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics, 2nd Edition, 1999.

[BM] T. Bodineau, J. Martin. A universality property for last-passage percolation paths close to the axis.
Elec. Comm. Prob., vol. 10, pp. 105-112, 2005.

[BH] J.-C. Breton, C. Houdré. Simultaneous asymptotics for the shape of random Young tableaux with growingly reshuffled
alphabets. Bernoulli, vol. 16, no. 2, pp. 471-492, 2010.

[CZFYZ] W.T. Chan, Y. Zhang, S.P.Y. Fung, D. Ye and H. Zhu. Efficient algorithms for finding a longest common increasing
subsequence. Lecture Notes in Comput. Sci., vol. 3827, Springer, Berlin, pp. 655-674, 2005.

[DKFPWS] A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson, O. White and S.L. Salzberg. Alignment of whole genomes.
Nucleic Acids Research, vol. 27, no. 11, pp. 2369-2376, 1999.

[GW] P. W. Glynn, W. Whitt. Departures from many queues in series. Ann. Appl. Probab., 1(4), pp. 546-572, 1991.

[GTW] J. Gravner, C. A. Tracy, H. Widom. Limit theorems for height fluctuations in a class of discrete space and time
growth models. J. Stat. Phys., vol. 102, pp. 1085-1132, 2001.

[HI] C. Houdré and U. Islak. A central limit theorem for the length of the longest common subsequences in random words.
ArXiv #Math.PR/1408.1559v3, 2015.

[HLM] C. Houdré, J. Lember, H. Matzinger. On the longest common increasing binary subsequence. C.R. Acad. Sci., Paris
Ser. I, vol. 343, pp. 589-594, 2006.

[HL] C. Houdré, T. Litherland. On the longest increasing subsequence for finite and countable alphabets, in High Dimensional
Probability V: The Luminy Volume (Beachwood, Ohio, USA: Institute of Mathematical Statistics), pp. 185-212, 2009.

[HX] C. Houdré, H. Xu. On the limiting shape of Young diagrams associated with inhomogeneous random words, in: High
Dimensional Probability VI: The Banff Volume, Progress in Probability, 66, Birkhäuser, pp. 277-302, 2013.

[ITW1] A. Its, C. A. Tracy, H. Widom. Random words, Toeplitz determinants, and integrable systems. I. Random matrix
models and their applications, pp. 245-258, Math. Sci. Res. Inst. Publ., vol. 40, Cambridge Univ. Press, Cambridge,
2001.

[ITW2] A. Its, C. A. Tracy, H. Widom. Random words, Toeplitz determinants, and integrable systems. II. Advances in
nonlinear mathematics and science. Phys. D., vol. 152-153, pp. 199-224, 2001.




[Joh] K. Johansson. Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. of Math. (2) 153 (2001), 
no. 1, 259-296. 

[Ker] S. Kerov. Asymptotic Representation Theory of the Symmetric Group and its Applications in Analysis, Vol. 219. AMS, 
Translations of Mathematical Monographs, 2003. (Russian edition: D. Sci thesis, 1994) 

[Rom] D. Romik. The surprising mathematics of longest increasing subsequences. Cambridge University Press, 2014. 

[Sak] Y. Sakai. A linear space algorithm for computing a longest common increasing subsequence. Information Processing 
Letters, vol. 99, pp. 203-207, 2006. 

[Sep] T. Seppäläinen. A scaling limit for queues in series. Ann. Appl. Probab., 7(4), pp. 855-872, 1997.

[TW] C. A. Tracy, H. Widom. On the distribution of the lengths of the longest increasing monotone subsequences in random 
words. Probab. Theor. Rel. Fields, vol. 119, pp. 350-380, 2001. 

[VK] A. M. Vershik, S. V. Kerov. Asymptotic behavior of the Plancherel measure of the symmetric group and the limit form 
of Young tableaux. Soviet Math. Dokl. (English translation), 233(1-6): pp. 527-531, 1977. 





